Rice pollen and seed development are directly related to grain yield. To further improve rice yield, it is important to functionally annotate the genes controlling pollen and seed development and to use them in rice breeding. Here we first carried out a genome-wide expression analysis with an emphasis on genes involved in rice pollen and seed development. Based on the transcript profiling, we identified and functionally classified 82 highly expressed pollen-specific, 12 developing seed-specific and 19 germinating seed-specific genes. We then describe the use of maize transposon Dissociation (Ds) insertion lines for functional genomics of rice pollen and seed development and as alternative germplasm resources for rice breeding. We established a two-element Activator/Dissociation (Ac/Ds) gene trap tagging system and generated around 20,000 Ds insertion lines. We screened these lines for high and low yield and obtained some interesting lines with higher yield or male sterility. Flanking sequence tag (FST) analyses showed that the Ds-tagged genes encode various proteins, including transcription factors, transport proteins and proteins of unknown function, and that they exhibit diverse expression patterns. Our results suggest that rice can be improved not only by introducing foreign genes but also by knocking out its endogenous genes. This finding might provide a new way for rice breeders to further improve rice varieties.
Rice is one of the main staples in the world and is cultivated mainly in Asia, Africa, and Latin America, accounting for 50-80% of the daily diet of approximately half the world's population. With a growing population and decreasing cultivable land, it is estimated that 40% more rice will have to be produced by 2030 1. It is therefore important to identify, on a genome-wide scale, and functionally characterize the rice genes related to yield traits. Pollen and seed development are directly related to grain production. Many of the known yield-related genes are involved in pollen or grain development, and knockout of pollen- or seed-specific genes can lead to male sterility or abnormal seed development, thus reducing seed yield 2-7. Genome-wide identification of genes specifically expressed in pollen or seeds could therefore contribute significantly to a better understanding of the biological mechanisms controlling grain yield, and it is also a prerequisite for the functional genomics of rice pollen and seed development.
On the other hand, rice breeders need to develop new rice varieties with higher yield. Two breeding strategies can contribute to producing more rice: one is to develop new conventional rice varieties and the other is to develop new hybrid rice. Elite germplasm is a prerequisite for both strategies. In rice, three kinds of elite germplasm resources have contributed significantly to higher yield. The first is the dwarf germplasm Dee-geo-woo-gen from China, from which the rice variety IR8 was developed 8. The second is the cytoplasmic male sterile (CMS) rice lines and their fertility restorers, which are widely used to produce hybrid rice seeds by three-line hybrid rice technology because they eliminate the need for laborious hand emasculation 9. The release of various three-line hybrid rice combinations has achieved a yield advantage of more than 20% over improved conventional varieties 10. The third is the photoperiod (temperature)-sensitive male sterile lines. These lines can be used to produce hybrid rice seeds by two-line hybrid rice technology because they can serve not only as male-sterile lines but also as maintainer lines. Thus, the heterosis between subspecies can be exploited and higher yield can be achieved 9. The yield of two-line hybrid rice is 5-10% higher than that of the existing three-line hybrid rice 10. It is therefore very important to develop such elite germplasm.
Useful germplasm can be developed from natural mutant populations, which remain important for current breeding. Besides natural variation, physical or chemical mutagenesis is also an efficient method to produce diverse variation for crop breeding 11. More than 2250 varieties developed from direct mutants or their progenies have been released 12. Variation arising from tissue culture has also contributed to the collection of germplasm resources, and a few cultivars have been developed in this way, including rice cultivars 13, 14. In addition, transgenic techniques have been employed to produce germplasm improved in specific traits 15. However, to face the severe challenges of rapid population growth and reduced cropland area, it is necessary to develop new ways to produce more elite germplasm for the development of higher-yield varieties in rice breeding.
With the completion of both the japonica and indica rice genome sequences 16, 17, assigning functions to unknown or predicted genes has become the major task of functional genomics. Knocking out a gene is a direct way to achieve this. Insertional mutagenesis with either T-DNA or transposons has been successfully used in plant functional genomics 18, and various insertion mutant populations have been produced 19-26. An important question is how to use these resources for the functional annotation of rice genes. Among them, we are interested in the genes related to rice pollen and seed development, since they may contribute to higher rice yield through molecular breeding. In this study, we first identified the highly expressed pollen- and seed-specific genes based on genome-wide transcript profiling in rice. In addition, since we have generated around 20,000 rice insertion lines using the maize Ac/Ds transposon system 27, we investigated the variation in pollen and seed development among these insertion lines. We then analyzed their flanking sequence tags (FSTs) to anchor the genes carrying Ds insertions, and carried out expression analyses to better understand their functions in pollen and seed development. Finally, we evaluated the potential of these lines in rice breeding.
Japonica rice (Oryza sativa, cv. Nipponbare) was used for all of our experiments. Both wild-type (WT) and mutant plants were grown under both greenhouse and natural field conditions.
The Ac/Ds tagging system was established according to Kolesnik et al. (2004) 27. Homozygous lines at the sixth generation were used to screen for grain yield and for tolerance or resistance, or increased sensitivity, to biotic stresses. Screens for lines with higher or lower yield were carried out in both Singapore and China. In Singapore, only small-scale screens were conducted, with 12 plants per line grown in a greenhouse under natural light and temperature conditions. In China, field trials were carried out with around 300 (30 cm × 10 cm) individuals per line, as described by Jiang et al. (2007) 28. The agronomic traits investigated included seed weight per plant and seed setting rate per plant, following the standard evaluation system for rice available from the International Rice Research Institute (IRRI) (http://www.knowledgebank.irri.org/extension/index.php/ses). To evaluate viability, pollen grains were stained with 1% iodine potassium iodide (I2/KI) solution. I2/KI solution is widely used for staining starch, and the starch content of pollen grains can serve as an indicator of viability (29).
Sequences flanking the Ds element in the putative candidate lines obtained from the various screens were amplified by TAIL-PCR (Thermal Asymmetric Interlaced PCR) as described by Liu et al. (1995) 30. Tagged genes were identified by submitting the Ds flanking sequences to BLAST searches against the TIGR (The Institute for Genomic Research, http://tigrblast.tigr.org/euk-blast/index.cgi?project=osa1) databases. The locus numbers retrieved from the TIGR database were used to search for the expression patterns of the tagged genes in the public plant MPSS (Massively Parallel Signature Sequencing) database (31; http://mpss.udel.edu/). MPSS identifies short sequence signatures produced from a defined position within an mRNA, and the relative abundance of these signatures in a given library represents a quantitative estimate of the expression of that gene; the MPSS signatures are 17 or 20 bp in length and can uniquely identify >95% of all genes in rice 31. Signatures are normalized to transcripts per million (TPM) to facilitate comparisons among different tissues or treatments. Following the default settings of the database, the sum of signatures from class 1 (signatures detected inside an annotated open reading frame, ORF), class 2 (within 500 bp 3' of an annotated ORF), class 5 (within an annotated intron, sense strand) and class 7 (spanning an intron splice site) was used for expression analyses.
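The normalization and class-based summation described above can be sketched as follows. This is only an illustration under stated assumptions: TPM is taken in its usual sense (signature count divided by the total count in the library, times one million), and the function names, the input layout and the locus label are hypothetical rather than part of the MPSS database interface.

```python
from collections import defaultdict

# Signature classes summed per gene, as listed in the text (classes 1, 2, 5 and 7).
INCLUDED_CLASSES = {1, 2, 5, 7}

def to_tpm(raw_counts):
    """Scale the raw signature counts of one library to transcripts per million.

    raw_counts: dict mapping signature -> raw count (hypothetical input layout).
    """
    total = sum(raw_counts.values())
    if total == 0:
        return {sig: 0.0 for sig in raw_counts}
    return {sig: count * 1_000_000 / total for sig, count in raw_counts.items()}

def gene_expression(tpm_by_signature, signature_info):
    """Sum TPM per locus over signatures that belong to the included classes.

    signature_info: dict mapping signature -> (locus, signature_class).
    """
    totals = defaultdict(float)
    for sig, tpm in tpm_by_signature.items():
        locus, sig_class = signature_info[sig]
        if sig_class in INCLUDED_CLASSES:
            totals[locus] += tpm
    return dict(totals)
```

Signatures of other classes are simply ignored here, which mirrors the statement that only the four listed classes were summed.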
Genome-wide expression analysis of rice genes was also carried out using the expression data from the rice MPSS database (32; http://mpss.udel.edu/rice/). The database provides expression data from a total of 70 different libraries. We were interested in 7 of these libraries: one from mature pollen, one from germinating seeds, and the remaining 5 from developing seeds. Expression data from all libraries were downloaded from the MPSS database, and the expression abundance in the 7 selected libraries was compared with that in the remaining libraries. A gene was classified as pollen- or seed-specific if its transcript abundance in the pollen or seed libraries was higher than the sum of its abundance in the remaining libraries. Highly expressed pollen- or seed-specific genes were those with an expression abundance higher than 1000 TPM, as determined by the MPSS database.
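A minimal sketch of this two-step filter is given below. It assumes that, when a target tissue is represented by several libraries (as for developing seeds), their abundances are summed before comparison; the text does not state this explicitly. The function name, data layout and library labels are hypothetical.

```python
def classify_specific(expression, target_libs, high_tpm=1000):
    """Return (specific, highly_expressed_specific) lists of loci.

    expression: dict mapping locus -> dict mapping library name -> TPM.
    target_libs: set of library names for the tissue of interest.
    A locus is called specific when its summed TPM in the target libraries
    exceeds its summed TPM in all remaining libraries, and highly expressed
    when that target abundance is also above high_tpm.
    """
    specific, highly = [], []
    for locus, by_lib in expression.items():
        target = sum(tpm for lib, tpm in by_lib.items() if lib in target_libs)
        rest = sum(tpm for lib, tpm in by_lib.items() if lib not in target_libs)
        if target > rest:
            specific.append(locus)
            if target > high_tpm:
                highly.append(locus)
    return specific, highly
```

Because the comparison is against the sum of all other libraries rather than their maximum, a gene with moderate expression spread over many tissues can fail the test even if pollen is its single strongest tissue.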
GO annotations for the rice pollen- and seed-specific genes were downloaded from the TIGR rice genome annotation database (now moved to Michigan State University (MSU); http://rice.plantbiology.msu.edu/; 33, 34). We used plant-related GO Slim terms 35 to explore the functions of these genes. Each gene can be associated with several GO Slim terms in the molecular function (MF), cellular component (CC) and biological process (BP) functional categories 36. We analyzed each GO Slim category independently.
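Analyzing each category independently amounts to a per-category tally of terms, as in the sketch below. The input layout (a list of category and term pairs per locus) and the function name are assumptions made for illustration; they do not reflect the download format of the MSU database.

```python
from collections import Counter

def tally_go_slim(annotations):
    """Count, per GO category, how many genes carry each GO Slim term.

    annotations: dict mapping locus -> list of (category, term) pairs,
    with category expected to be "MF", "BP" or "CC".
    Returns (counts, unannotated), where counts maps category -> Counter
    and unannotated lists the loci with no GO Slim term at all.
    """
    counts = {"MF": Counter(), "BP": Counter(), "CC": Counter()}
    unannotated = []
    for locus, terms in annotations.items():
        if not terms:
            unannotated.append(locus)
            continue
        # set() ensures a gene is counted once per term even if listed twice.
        for category, term in set(terms):
            counts.setdefault(category, Counter())[term] += 1
    return counts, unannotated
```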
To identify and characterize the genes related to rice pollen and seed development, we first investigated genome-wide expression profiles using the publicly available MPSS expression database. The database contains expression data from 70 libraries constructed from tissues at various developmental stages. Based on our analyses, we identified more than 1000 putative pollen-specific genes. These genes were preferentially expressed in pollen, as their expression abundance there was higher than the sum of their abundance in the remaining tissues. Among them, we identified 82 highly expressed pollen-specific genes with an expression abundance of more than 1000 TPM. The MSU locus names of these genes and their transcript abundance are listed in Table 1. Similarly, we identified 12 developing seed-specific and 19 germinating seed-specific genes that were highly expressed in developing and germinating seeds, respectively (Tables 2 and 3). The number of seed-specific genes identified is relatively low because genes with low expression abundance were not included in our analyses.
To examine the functional specificities of these pollen- and seed-specific genes, we identified their Gene Ontology (GO) terms (Materials and methods). For each gene, we identified GO Slim terms in three categories: MF, BP and CC. Half of the pollen-specific genes have no GO annotation in any of the three categories (Table 4). The GO Slim terms for all remaining genes are listed in the table. These pollen-specific genes have been assigned to various biological processes, suggesting that pollen development is a complex biological process involving genes whose proteins are located in various cellular components and have different molecular functions. Similarly, we identified the GO terms of the seed-specific genes. The analysis shows that no GO term can be assigned to two-thirds of these genes (Table 5). The remaining seed-specific genes are involved in various biological processes, and their proteins are located in different cellular components and have multiple molecular functions (Table 5).
Pollen and seed development is a very complex process. It involves not only pollen- and seed-specific genes but also other genes with different expression patterns. Moreover, besides the genes related to pollen and seed development, other genes may also contribute to the improvement of rice yield traits. To dissect the genes contributing to higher yield, other strategies have to be employed. In the following section, we report the establishment of an Ac/Ds gene tagging system and its application to gene function annotation and rice breeding.
We have developed an efficient two-element maize Ac/Ds gene trap system 27. In this system, two different parental lines are crossed. One parental line is a transgenic Ac plant, in which the Ac element is immobilized and supplies only the Ac transposase under the control of the 35S promoter. The other parental line is a transgenic Ds plant, in which only the two ends of the Ds element (5' Ds and 3' Ds) are present. This element carries a bar gene conferring resistance to the herbicide Basta, which serves as a positive selection (transposition) marker. The promoter-less gusA gene, encoding β-glucuronidase, serves as a reporter (expression marker) for detecting the expression patterns of tagged rice genes (gene trap). The green fluorescent protein gene (gfp) under the maize ubiquitin promoter serves as a negative selection marker in both the Ac plants (to obtain stable transposants) and the Ds plants (to enrich for unlinked transposants). In these starter lines, neither the Ac nor the Ds element is capable of transposition. After crossing with Ac plants, the Ds element can be mobilized and transposed to different genome positions in the F1 generation. F2 seeds are generated by selfing these F1 plants. Putative unlinked, stable transposants are selected by screening for GFP-negative, Basta-resistant F2 seedlings. Because the Ac locus also carries gfp as a negative selection marker, GFP-negative F2 seedlings lack the transposase source and are therefore stable. These stable Ds lines were selfed to the F3, F4 and F5 generations to obtain homozygous lines and to screen for phenotypes.
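The F2 selection rule described above combines one positive and one negative marker, which can be written as a simple filter. The sketch below is purely illustrative: the `Seedling` record and its field names are hypothetical, and in practice both traits are scored on the plants themselves.

```python
from dataclasses import dataclass

@dataclass
class Seedling:
    seedling_id: str
    gfp_positive: bool      # GFP signal marks the Ac locus or the Ds donor site
    basta_resistant: bool   # bar gene on the Ds element confers Basta resistance

def select_stable_transposants(seedlings):
    """Keep F2 seedlings that are Basta resistant (carry Ds) and GFP negative
    (carry neither the Ac locus nor the original Ds donor site)."""
    return [s for s in seedlings if s.basta_resistant and not s.gfp_positive]
```

A seedling that is GFP positive is discarded even if it is Basta resistant, because it may still carry the transposase source or an untransposed, linked Ds element.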
Our general goal is to generate a large number of Ds insertion lines using this system and to screen them for agriculturally important phenotypes for commercial release. So far we have generated more than 20,000 Ds insertion lines, of which around 18,000 are homozygous. In addition, more than 3,000 Ds flanking sequences have been obtained from the corresponding lines.
So far, we have carried out two rounds of high-yield screening. In the preliminary screen, only around 10 plants per line were used because around 20,000 Ds lines were screened. Yield data were collected from 16,700 of these Ds lines. The lines differed significantly in yield performance (Fig. 1), suggesting that Ds lines can be used for yield screening and subsequent breeding. Most lines had a yield similar to that of WT plants. The average yield of the Ds lines was 14.1 g per plant, similar to that of WT plants. Some lines showed lower yield and others showed higher yield (Fig. 1). Based on this screen, 343 lines with higher yield were selected and subjected to a second screen, in which around 300 plants per line were planted. The data are presented in Fig. 2. In this screen, most of the Ds lines showed higher yield than WT, and the average yield increased to 16.1 g per plant. Based on these screens, we selected 288 Ds lines with more than 50% higher yield than WT (Table 6).
Phenotypic investigation showed that, besides higher grain yield, these Ds lines also exhibited additional agronomic traits such as a slightly delayed growth period, higher tiller number and more vigorous growth. We also observed some higher-yield Ds lines with a slightly shorter growth period and erect leaves. One of these higher-yield lines is shown in Fig. 3. This line grew very vigorously and reached nearly double the size of the WT plant (Fig. 3A). As a result, the mutant produced 25% more seeds than WT. Cross-sections of tillers showed less starch and more fiber in the mutant (Fig. 3B). The mutant can therefore be used for multiple purposes: not only for harvesting seeds but also for collecting rice straw for other uses, for example paper making or the cultivation of edible fungi. This mutant contains no Ds element; the footprint left by Ds remobilization is the cause of the phenotype. Remobilizing the Ds element from the exons of genes may thus yield plants that show a mutant phenotype but carry no Ds element.
DNA samples from some of the 288 Ds lines were subjected to TAIL-PCR 30 to obtain the sequences flanking the Ds element. Flanking sequence analysis revealed that knockout of many rice genes by Ds insertion may contribute to higher yield. These genes encode various transcription factors, sucrose transport proteins, hormone-regulated proteins, and so on. Some representative genes are listed in Fig. 4A. They include genes encoding an ATP binding protein, an amino acid selective channel protein, a receptor-like protein kinase, 26S proteasome non-ATPase regulatory subunit 3, a lipid binding protein and a protein phosphatase type 2A regulator. Furthermore, expression data from the rice MPSS database were used to analyze the expression patterns of these genes using their TIGR locus numbers. In total, data from 11 different tissues were retrieved and analyzed: young leaf, mature leaf, young and mature root, stem, meristematic tissue, immature panicle, ovary and mature stigma, mature pollen, developing seed, and germinating seed. These analyses showed that the candidate genes for the high-yield phenotype were expressed in different tissues (Fig. 4B). Some were detected in multiple tissues at varying expression levels, for example LOC_Os09g37000 and LOC_Os03g58940 (Fig. 4B). Others were expressed in specific tissues. For example, LOC_Os05g51070 was mainly expressed in young and mature leaves; both LOC_Os05g02060 and LOC_Os02g33630 were mainly expressed in developing seeds; and LOC_Os05g42210 was mainly expressed in meristematic tissues (Fig. 4B). These diverse expression patterns suggest that high yield might be controlled by multiple genes acting in various pathways.
Besides the screen for high-yield Ds insertion lines, a low-yield screen was also carried out on the same population. Among the 16,700 Ds insertion lines, at least 436 showed a low-yield phenotype, with grain yield reduced by at least 25% (Table 6). Some of these lines were completely sterile and were characterized further. The investigation showed that some of them were completely male sterile, with 100% inviable mature pollen (Fig. 5A). Another type of sterility was the lack of mature pollen at the flowering stage (Fig. 5B). In addition, more than 500 lines did not segregate homozygous plants even in the F5/F6 generation, indicating that these may be homozygous lethal lines that can be propagated only as heterozygous plants. Interestingly, we observed another type of sterility, i.e. photoperiod-sensitive male sterility. This line exhibited male sterility under short-day conditions, and fertility was recovered under long-day conditions 37. This line may therefore be useful for developing two-line hybrid rice varieties.
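The two yield cut-offs used in these screens (more than 50% above WT for high-yield lines, at least 25% below WT for low-yield lines) can be expressed as a simple classification on relative yield. The sketch below assumes yield is summarized as mean seed weight per plant; the function name, argument layout and line identifiers are hypothetical.

```python
def classify_by_yield(line_yields, wt_yield, high_gain=0.50, low_loss=0.25):
    """Split lines into high-yield and low-yield candidates relative to WT.

    line_yields: dict mapping line id -> mean seed weight per plant (g).
    wt_yield: mean seed weight per plant of the wild type (g).
    High yield: more than high_gain above WT.
    Low yield: at least low_loss below WT.
    """
    if wt_yield <= 0:
        raise ValueError("wt_yield must be positive")
    high, low = [], []
    for line_id, line_yield in line_yields.items():
        change = (line_yield - wt_yield) / wt_yield
        if change > high_gain:
            high.append(line_id)
        elif change <= -low_loss:
            low.append(line_id)
    return high, low
```

Lines falling between the two cut-offs are left unclassified, which corresponds to the majority of lines with yield similar to WT.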
In this study, we identified on a genome-wide scale 82 highly expressed pollen-specific, 12 developing seed-specific and 19 germinating seed-specific genes. Recently, Fujita et al. (2010) and Wei et al. (2010) also reported expression atlases of rice genes at the reproductive developmental stage, and they identified more genes specifically expressed at this stage 38, 39. The difference may be due to the methods employed and the criteria used for identifying tissue-specific genes. Our expression analysis may also provide a basis for screening pollen- or seed-specific promoters, which should be useful for engineering genetically modified rice varieties. In fact, some seed-specific promoters, such as those of some glutelin genes, have already been used for the production of transgenic rice. For example, Akama et al. (2009) employed the seed-specific promoter GluB-1 to produce gamma-aminobutyric acid (GABA)-enriched rice grains that contribute to a decrease in blood pressure 40. A seed-specific promoter has also been used to explore the potential of producing rice seed-based edible vaccines 41.
Grain yield in rice is a complex trait multiplicatively determined by its three component traits: number of panicles, number of grains per panicle, and grain weight, all of which are typical quantitative traits 42. Grain yield decreases if pollen or seeds cannot develop properly, since viable pollen, a receptive stigma and a well-developed ovule are required for successful seed set in rice. Transcript profiling of pollen and seed development will contribute significantly to the identification of genes for grain yield 43. Not all pollen- or seed-specific genes necessarily contribute directly to grain production. However, some of them are related to grain yield, as shown in this study. Thus, our study may provide information for further improving grain yield by genetically modifying pollen- and seed-related genes.
Several yield-related genes have been isolated to date 44-49. However, only a few of them have been functionally characterized. Since a considerable number of genes related to pollen and seed development may contribute to grain production, functional genomics studies of rice pollen and seed development may speed up the identification of yield-related genes. The pollen- and seed-specific genes identified here can be used in reverse genetics screens to obtain the corresponding Ds insertion lines, and their biological functions can then be annotated by characterizing these lines. Conversely, since we have identified several hundred Ds insertion lines with altered grain production, yield-related genes can be identified and annotated from these Ds-tagged lines. Once identified and functionally characterized, these yield-related genes can be employed to further improve rice yield by over-expressing or suppressing them through marker-free transgenic strategies 28. In the meantime, tagged Ds lines may be used directly for developing non-transgenic rice varieties with higher yield according to our strategies 28.
We thank Drs. Ildiko Szeverenyi, Tatiana Kolesnik and Doris Bachmann for their help in generation of transposant lines. We take this opportunity to thank Zhigang Ma, Rengasamy Ramamoorthy and Hongfen Luan for their technical assistance.