The species Brassica rapa (2n=20, AA) is an important vegetable and oilseed crop, and serves as an excellent model for genomic and evolutionary research in Brassica species. With the availability of whole genome sequence of B. rapa, it is essential to further determine the activity of all functional elements of the B. rapa genome and explore the transcriptome on a genome-wide scale. Here, RNA-seq data was employed to provide a genome-wide transcriptional landscape and characterization of the annotated and novel transcripts and alternative splicing events across tissues.
RNA-seq reads were generated using the Illumina platform from six different tissues (root, stem, leaf, flower, silique and callus) of the B. rapa accession Chiifu-401-42, the same line used for whole genome sequencing. First, these data detected the widespread transcription of the B. rapa genome, leading to the identification of numerous novel transcripts and definition of 5'/3' UTRs of known genes. Second, 78.8% of the total annotated genes were detected as expressed and 45.8% were constitutively expressed across all tissues. We further defined several groups of genes: housekeeping genes, tissue-specific expressed genes and co-expressed genes across tissues, which will serve as a valuable repository for future crop functional genomics research. Third, alternative splicing (AS) is estimated to occur in more than 29.4% of intron-containing B. rapa genes, and 65% of them were commonly detected in more than two tissues. Interestingly, genes with high rate of AS were over-represented in GO categories relating to transcriptional regulation and signal transduction, suggesting potential importance of AS for playing regulatory role in these genes. Further, we observed that intron retention (IR) is predominant in the AS events and seems to preferentially occurred in genes with short introns.
The high-resolution RNA-seq analysis provides a global transcriptional landscape as a complement to the B. rapa genome sequence, which will advance our understanding of the dynamics and complexity of the B. rapa transcriptome. The atlas of gene expression in different tissues will be useful for accelerating research on functional genomics and genome evolution in Brassica species.
Brassica rapa; RNA-seq; Alternative splicing; Transcriptome
Euchromatic regions of the Brassica rapa genome were sequenced and mapped onto the corresponding regions in the Arabidopsis thaliana genome.
Brassica rapa is one of the most economically important vegetable crops worldwide. Owing to its agronomic importance and phylogenetic position, B. rapa provides a crucial reference to understand polyploidy-related crop genome evolution. The high degree of sequence identity and remarkably conserved genome structure between Arabidopsis and Brassica genomes enables comparative tiling sequencing using Arabidopsis sequences as references to select the counterpart regions in B. rapa, which is a strong challenge of structural and comparative crop genomics.
We assembled 65.8 megabase-pairs of non-redundant euchromatic sequence of B. rapa and compared this sequence to the Arabidopsis genome to investigate chromosomal relationships, macrosynteny blocks, and microsynteny within blocks. The triplicated B. rapa genome contains only approximately twice the number of genes as in Arabidopsis because of genome shrinkage. Genome comparisons suggest that B. rapa has a distinct organization of ancestral genome blocks as a result of recent whole genome triplication followed by a unique diploidization process. A lack of the most recent whole genome duplication (3R) event in the B. rapa genome, atypical of other Brassica genomes, may account for the emergence of B. rapa from the Brassica progenitor around 8 million years ago.
This work demonstrates the potential of using comparative tiling sequencing for genome analysis of crop species. Based on a comparative analysis of the B. rapa sequences and the Arabidopsis genome, it appears that polyploidy and chromosomal diploidization are ongoing processes that collectively stabilize the B. rapa genome and facilitate its evolution.
The Brassicaceae family includes the model plant Arabidopsis thaliana as well as a number of agronomically important species such as oilseed crops (in particular Brassica napus, B. juncea and B. rapa) and vegetables (eg. B. rapa and B. oleracea).
Separated by only 10-20 million years, Brassica species and Arabidopsis thaliana are closely related, and it is expected that knowledge obtained relating to Arabidopsis growth and development can be translated into Brassicas for crop improvement. Moreover, certain aspects of plant development are sufficiently different between Brassica and Arabidopsis to warrant studies to be carried out directly in the crop species. However, mutating individual genes in the amphidiploid Brassicas such as B. napus and B. juncea may, on the other hand, not give rise to expected phenotypes as the genomes of these species can contain up to six orthologues per single-copy Arabidopsis gene. In order to elucidate and possibly exploit the function of redundant genes for oilseed rape crop improvement, it may therefore be more efficient to study the effects in one of the diploid Brassica species such as B. rapa. Moreover, the ongoing sequencing of the B. rapa genome makes this species a highly attractive model for Brassica research and genetic resource development.
Seeds from the diploid Brassica A genome species, B. rapa were treated with ethyl methane sulfonate (EMS) to produce a TILLING (Targeting Induced Local Lesions In Genomes) population for reverse genetics studies. We used the B. rapa genotype, R-o-18, which has a similar developmental ontogeny to an oilseed rape crop. Hence this resource is expected to be well suited for studying traits with relevance to yield and quality of oilseed rape. DNA was isolated from a total of 9,216 M2 plants and pooled to form the basis of the TILLING platform. Analysis of six genes revealed a high level of mutations with a density of about one per 60 kb. This analysis also demonstrated that screening a 1 kb amplicon in just one third of the population (3072 M2 plants) will provide an average of 68 mutations and a 97% probability of obtaining a stop-codon mutation resulting in a truncated protein. We furthermore calculated that each plant contains on average ~10,000 mutations and due to the large number of plants, it is predicted that mutations in approximately half of the GC base pairs in the genome exist within this population.
We have developed the first EMS TILLING resource in the diploid Brassica species, B. rapa. The mutation density in this population is ~1 per 60 kb, which makes it the most densely mutated diploid organism for which a TILLING population has been published. This resource is publicly available through the RevGenUK reverse genetics platform http://revgenuk.jic.ac.uk.
The genus Brassica includes the most extensively cultivated vegetable crops worldwide. Investigation of the Brassica genome presents excellent challenges to study plant genome evolution and divergence of gene function associated with polyploidy and genome hybridization. A physical map of the B. rapa genome is a fundamental tool for analysis of Brassica "A" genome structure. Integration of a physical map with an existing genetic map by linking genetic markers and BAC clones in the sequencing pipeline provides a crucial resource for the ongoing genome sequencing effort and assembly of whole genome sequences.
A genome-wide physical map of the B. rapa genome was constructed by the capillary electrophoresis-based fingerprinting of 67,468 Bacterial Artificial Chromosome (BAC) clones using the five restriction enzyme SNaPshot technique. The clones were assembled into contigs by means of FPC v8.5.3. After contig validation and manual editing, the resulting contig assembly consists of 1,428 contigs and is estimated to span 717 Mb in physical length. This map provides 242 anchored contigs on 10 linkage groups to be served as seed points from which to continue bidirectional chromosome extension for genome sequencing.
The map reported here is the first physical map for Brassica "A" genome based on the High Information Content Fingerprinting (HICF) technique. This physical map will serve as a fundamental genomic resource for accelerating genome sequencing, assembly of BAC sequences, and comparative genomics between Brassica genomes. The current build of the B. rapa physical map is available at the B. rapa Genome Project website for the user community.
Brassica rapa (AA) contains very diverse forms which include oleiferous types and many vegetable types. Genome sequence of B. rapa line Chiifu (ssp. pekinensis), a leafy vegetable type, was published in 2011. Using this knowledge, it is important to develop genomic resources for the oleiferous types of B. rapa. This will allow more involved molecular mapping, in-depth study of molecular mechanisms underlying important agronomic traits and introgression of traits from B. rapa to major oilseed crops - B. juncea (AABB) and B. napus (AACC). The study explores the availability of SNPs in RNA-seq generated contigs of three oleiferous lines of B. rapa - Candle (ssp. oleifera, turnip rape), YSPB-24 and Tetra (ssp. trilocularis, Yellow sarson) and their use in genome-wide linkage mapping and specific-region fine mapping using a RIL population between Chiifu and Tetra.
RNA-seq was carried out on the RNA isolated from young inflorescences containing unopened floral buds, floral axis and small leaves, using Illumina paired-end sequencing technology. Sequence assembly was carried out using the Velvet de-novo programme and the assembled contigs were organised against Chiifu gene models, available in the BRAD-CDS database. RNA-seq confirmed the presence of more than 17,000 single-copy gene models described in the BRAD database. The assembled contigs and the BRAD gene models were analyzed for the presence of SSRs and SNPs. While the number of SSRs was limited, more than 0.2 million SNPs were observed between Chiifu and the three oleiferous lines. Assays for SNPs were designed using KASPar technology and tested on a F7-RIL population derived from a Chiifu x Tetra cross. The design of the SNP assays were based on three considerations - the 50 bp flanking region of the SNPs should be strictly similar, the SNP should have a read-depth of ≥7 and no exon/intron junction should be present within the 101 bp target region. Using these criteria, a total of 640 markers (580 for genome-wide mapping and 60 for specific-region mapping) marking as many genes were tested for mapping. Out of 640 markers that were tested, 594 markers could be mapped unambiguously which included 542 markers for genome-wide mapping and 42 markers for fine mapping of the tet-o locus that is involved with the trait tetralocular ovary in the line Tetra.
A large number of SNPs and PSVs are present in the transcriptome of B. rapa lines for genome-wide linkage mapping and specific-region fine mapping. Criteria used for SNP identification delivered markers, more than 93% of which could be successfully mapped to the F7–RIL population of Chiifu x Tetra cross.
Brassica rapa; RNA-seq; Next generation sequencing; Single nucleotide polymorphism (SNP); Paralog specific variation (PSV); Coding DNA Sequences (CDS); KASPar assays
The Brassica species include an important group of crops and provide opportunities for studying the evolutionary consequences of polyploidy. They are related to Arabidopsis thaliana, for which the first complete plant genome sequence was obtained and their genomes show extensive, although imperfect, conserved synteny with that of A. thaliana. A large number of EST sequences, derived from a range of different Brassica species, are available in the public database, but no public microarray resource has so far been developed for these species.
We assembled unigenes using ~800,000 EST sequences, mainly from three species: B. napus, B. rapa and B. oleracea. The assembly was conducted with the aim of co-assembling ESTs of orthologous genes (including homoeologous pairs of genes in B. napus from each of the A and C genomes), but resolving assemblies of paralogous, or paleo-homoeologous, genes (i.e. the genes related by the ancestral genome triplication observed in diploid Brassica species). 90,864 unique sequence assemblies were developed. These were incorporated into the BAC sequence annotation for the Brassica rapa Genome Sequencing Project, enabling the identification of cognate genomic sequences for a proportion of them. A 60-mer oligo microarray comprising 94,558 probes was developed using the unigene sequences. Gene expression was analysed in reciprocal resynthesised B. napus lines and the B. oleracea and B. rapa lines used to produce them. The analysis showed that significant expression could consistently be detected in leaf tissue for 35,386 unigenes. Expression was detected across all four genotypes for 27,355 unigenes, genome-specific expression patterns were observed for 7,851 unigenes and 180 unigenes displayed other classes of expression pattern. Principal component analysis (PCA) clearly resolved the individual microarray datasets for B. rapa, B. oleracea and resynthesised B. napus. Quantitative differences in expression were observed between the resynthesised B. napus lines for 98 unigenes, most of which could be classified into non-additive expression patterns, including 17 that showed cytoplasm-specific patterns. We further characterized the unigenes for which A genome-specific expression was observed and cognate genomic sequences could be identified. Ten of these unigenes were found to be Brassica-specific sequences, including two that originate from complex loci comprising gene clusters.
We succeeded in developing a Brassica community microarray resource. Although expression can be measured for the majority of unigenes across species, there were numerous probes that reported in a genome-specific manner. We anticipate that some proportion of these will represent species-specific transcripts and the remainder will be the consequence of variation of sequences within the regions represented by the array probes. Our studies demonstrated that the datasets obtained from the arrays can be used for typical analyses, including PCA and the analysis of differential expression. We have also demonstrated that Brassica-specific transcripts identified in silico in the sequence assembly of public EST database accessions are indeed reported by the array. These would not be detectable using arrays designed using A. thaliana sequences.
Brassica rapa is an economically important crop and a model plant for studies concerning polyploidization and the evolution of extreme morphology. The multinational B. rapa Genome Sequencing Project (BrGSP) was launched in 2003. In 2008, next generation sequencing technology was used to sequence the B. rapa genome. Several maps concerning B. rapa pseudochromosome assembly have been published but their coverage of the genome is incomplete, anchoring approximately 73.6% of the scaffolds on to chromosomes. Therefore, a new genetic map to aid pseudochromosome assembly is required.
This study concerns the construction of a reference genetic linkage map for Brassica rapa, forming the backbone for anchoring sequence scaffolds of the B. rapa genome resulting from recent sequencing efforts. One hundred and nineteen doubled haploid (DH) lines derived from microspore cultures of an F1 cross between a Chinese cabbage (B. rapa ssp. pekinensis) DH line (Z16) and a rapid cycling inbred line (L144) were used to construct the linkage map. PCR-based insertion/deletion (InDel) markers were developed by re-sequencing the two parental lines. The map comprises a total of 507 markers including 415 InDels and 92 SSRs. Alignment and orientation using SSR markers in common with existing B. rapa linkage maps allowed ten linkage groups to be identified, designated A01-A10. The total length of the linkage map was 1234.2 cM, with an average distance of 2.43 cM between adjacent marker loci. The lengths of linkage groups ranged from 71.5 cM to 188.5 cM for A08 and A09, respectively. Using the developed linkage map, 152 scaffolds were anchored on to the chromosomes, encompassing more than 82.9% of the B. rapa genome. Taken together with the previously available linkage maps, 183 scaffolds were anchored on to the chromosomes and the total coverage of the genome was 88.9%.
The development of this linkage map is vital for the integration of genome sequences and genetic information, and provides a useful resource for the international Brassica research community.
The species Brassica rapa includes important vegetable and oil crops. It also serves as an excellent model system to study polyploidy-related genome evolution because of its paleohexaploid ancestry and its close evolutionary relationships with Arabidopsis thaliana and other Brassica species with larger genomes. Therefore, its genome sequence will be used to accelerate both basic research on genome evolution and applied research across the cultivated Brassica species.
We have determined and analyzed the sequence of B. rapa chromosome A3. We obtained 31.9 Mb of sequences, organized into nine contigs, which incorporated 348 overlapping BAC clones. Annotation revealed 7,058 protein-coding genes, with an average gene density of 4.6 kb per gene. Analysis of chromosome collinearity with the A. thaliana genome identified conserved synteny blocks encompassing the whole of the B. rapa chromosome A3 and sections of four A. thaliana chromosomes. The frequency of tandem duplication of genes differed between the conserved genome segments in B. rapa and A. thaliana, indicating differential rates of occurrence/retention of such duplicate copies of genes. Analysis of 'ancestral karyotype' genome building blocks enabled the development of a hypothetical model for the derivation of the B. rapa chromosome A3.
We report the near-complete chromosome sequence from a dicotyledonous crop species. This provides an example of the complexity of genome evolution following polyploidy. The high degree of contiguity afforded by the clone-by-clone approach provides a benchmark for the performance of whole genome shotgun approaches presently being applied in B. rapa and other species with complex genomes.
Improving crop species by breeding for salt tolerance or introducing salt tolerant traits is one method of increasing crop yields in saline affected areas. Extensive studies of the model plant species Arabidopsis thaliana has led to the availability of substantial information regarding the function and importance of many genes involved in salt tolerance. However, the identification and characterization of A. thaliana orthologs in species such as Brassica napus (oilseed rape) can prove difficult due to the significant genomic changes that have occurred since their divergence approximately 20 million years ago (MYA). The recently released Brassica rapa genome provides an excellent resource for comparative studies of A. thaliana and the cultivated Brassica species, and facilitates the identification of Brassica species orthologs which may be of agronomic importance. Sodium hydrogen antiporter (NHX) proteins transport a sodium or potassium ion in exchange for a hydrogen ion in the other direction across a membrane. In A. thaliana there are eight members of the NHX family, designated AtNHX1-8, that can be sub-divided into three clades, based on their subcellular localization: plasma membrane (PM), intracellular class I (IC-I) and intracellular class II (IC-II). In plants, many NHX proteins are primary determinants of salt tolerance and act by transporting Na+ out of the cytosol where it would otherwise accumulate to toxic levels. Significant work has been done to determine the role of both PM and IC-I clade members in salt tolerance in a variety of plant species, but relatively little analysis has been described for the IC-II clade. Here we describe the identification of B. napus orthologs of AtNHX5 and AtNHX6, using the B. rapa genome sequence, macro- and micro-synteny analysis, comparative expression and promoter motif analysis, and highlight the value of these multiple approaches for identifying true orthologs in closely related species with multiple paralogs.
Arabidopsis; NHX; antiporter; Brassica; sodium transport; potassium transport; pH; cation transport
Brassica rapa, which is closely related to
Arabidopsis thaliana, is an important crop and a
model plant for studying genome evolution via
polyploidization. We report the current understanding of the
genome structure of B. rapa and efforts for the
whole-genome sequencing of the species. The tribe
Brassicaceae, which comprises ca. 240 species,
descended from a common hexaploid ancestor with a basic genome
similar to that of Arabidopsis. Chromosome
rearrangements, including fusions and/or fissions, resulted in
the present-day “diploid” Brassica
species with variation in chromosome number and phenotype.
Triplicated genomic segments of B. rapa are
collinear to those of A. thaliana with InDels.
The genome triplication has led to an approximately 1.7-fold
increase in the B. rapa gene number compared to
that of A. thaliana. Repetitive DNA of B.
rapa has also been extensively amplified and has
diverged from that of A. thaliana. For its
whole-genome sequencing, the Brassica rapa Genome
Sequencing Project (BrGSP) consortium has developed suitable
genomic resources and constructed genetic and physical maps.
Ten chromosomes of B. rapa are being allocated to
BrGSP consortium participants, and each chromosome will be
sequenced by a BAC-by-BAC approach. Genome sequencing of
B. rapa will offer a new perspective for plant
biology and evolution in the context of polyploidization.
Brassica rapa is an important crop species that produces vegetables, oilseed, and fodder. Although many studies reported quantitative trait loci (QTL) mapping, the genes governing most of its economically important traits are still unknown. In this study, we report QTL mapping for morphological and yield component traits in B. rapa and comparative map alignment between B. rapa, B. napus, B. juncea, and Arabidopsis thaliana to identify candidate genes and conserved QTL blocks between them. A total of 95 QTL were identified in different crucifer blocks of the B. rapa genome. Through synteny analysis with A. thaliana, B. rapa candidate genes and intronic and exonic single nucleotide polymorphisms in the parental lines were detected from whole genome resequenced data, a few of which were validated by mapping them to the QTL regions. Semi-quantitative reverse transcriptase PCR analysis showed differences in the expression levels of a few genes in parental lines. Comparative mapping identified five key major evolutionarily conserved crucifer blocks (R, J, F, E, and W) harbouring QTL for morphological and yield components traits between the A, B, and C subgenomes of B. rapa, B. juncea, and B. napus. The information of the identified candidate genes could be used for breeding B. rapa and other related Brassica species.
Brassica rapa; quantitative trait loci (QTL); morphological traits; single nucleotide polymorphism (SNP); conserved genome blocks
Recent advances, such as the availability of extensive genome survey sequence (GSS)
data and draft physical maps, are radically transforming the means by which we
can dissect Brassica genome structure and systematically relate it to the Arabidopsis
model. Hitherto, our view of the co-linearities between these closely related genomes
had been largely inferred from comparative RFLP data, necessitating substantial
interpolation and expert interpretation. Sequencing of the Brassica rapa genome
by the Multinational Brassica Genome Project will, however, enable an entirely
computational approach to this problem. Meanwhile we have been developing
databases and bioinformatics tools to support our work in Brassica comparative
genomics, including a recently completed draft physical map of B. rapa integrated
with anchor probes derived from the Arabidopsis genome sequence. We are also
exploring new ways to display the emerging Brassica–Arabidopsis sequence homology
data. We have mapped all publicly available Brassica sequences in silico to the
Arabidopsis TIGR v5 genome sequence and published this in the ATIDB database
that uses Generic Genome Browser (GBrowse). This in silico approach potentially
identifies all paralogous sequences and so we colour-code the significance of the
mappings and offer an integrated, real-time multiple alignment tool to partition them
into paralogous groups. The MySQL database driving GBrowse can also be directly
interrogated, using the powerful API offered by the Perl Bio∷DB∷GFF methods,
facilitating a wide range of data-mining possibilities.
Molecular genetic maps provide a means to link heritable traits with underlying genome sequence variation. Several genetic maps have been constructed for Brassica species, yet to date, there has been no simple means to compare this information or to associate mapped traits with the genome sequence of the related model plant, Arabidopsis.
We have developed a comparative genetic map database for the viewing, comparison and analysis of Brassica and Arabidopsis genetic, physical and trait map information. This web-based tool allows users to view and compare genetic and physical maps, search for traits and markers, and compare genetic linkage groups within and between the amphidiploid and diploid Brassica genomes. The inclusion of Arabidopsis data enables comparison between Brassica maps that share no common markers. Analysis of conserved syntenic blocks between Arabidopsis and collated Brassica genetic maps validates the application of this system. This tool is freely available over the internet on .
This database enables users to interrogate the relationship between Brassica genetic maps and the sequenced genome of A. thaliana, permitting the comparison of genetic linkage groups and mapped traits and the rapid identification of candidate genes.
Sequencing of the chloroplast (cp) genome using traditional sequencing methods has been difficult because of its size (>120 kb) and the complicated procedures required to prepare templates. To explore the feasibility of sequencing the cp genome using DNA extracted from whole cells and Solexa sequencing technology, we sequenced whole cellular DNA isolated from leaves of three Brassica
rapa accessions with one lane per accession. In total, 246, 362, and 361 Mb sequence data were generated for the three accessions Chiifu-401-42, Z16, and FT, respectively. Micro-reads were assembled by reference-guided assembly using the cpDNA sequences of B. rapa, Arabidopsis thaliana, and Nicotiana tabacum. We achieved coverage of more than 99.96% of the cp genome in the three tested accessions using the B. rapa sequence as the reference. When A. thaliana or N. tabacum sequences were used as references, 99.7–99.8 or 95.5–99.7% of the B. rapa cp genome was covered, respectively. These results demonstrated that sequencing of whole cellular DNA isolated from young leaves using the Illumina Genome Analyzer is an efficient method for high-throughput sequencing of cp genome.
chloroplast genome; sequencing; Solexa sequencing technology; whole cellular DNA; Brassica rapa
The Brassica species, related to Arabidopsis thaliana, include an important group of crops and represent an excellent system for studying the evolutionary consequences of polyploidy. Previous studies have led to a proposed structure for an ancestral karyotype and models for the evolution of the B. rapa genome by triplication and segmental rearrangement, but these have not been validated at the sequence level.
We developed computational tools to analyse the public collection of B. rapa BAC end sequence, in order to identify candidates for representing collinearity discontinuities between the genomes of B. rapa and A. thaliana. For each putative discontinuity, one of the BACs was sequenced and analysed for collinearity with the genome of A. thaliana. Additional BAC clones were identified and sequenced as part of ongoing efforts to sequence four chromosomes of B. rapa. Strikingly few of the 19 inter-chromosomal rearrangements corresponded to the set of collinearity discontinuities anticipated on the basis of previous studies. Our analyses revealed numerous instances of newly detected collinearity blocks. For B. rapa linkage group A8, we were able to develop a model for the derivation of the chromosome from the ancestral karyotype. We were also able to identify a rearrangement event in the ancestor of B. rapa that was not shared with the ancestor of A. thaliana, and is represented in triplicate in the B. rapa genome. In addition to inter-chromosomal rearrangements, we identified and analysed 32 BACs containing the end points of segmental inversion events.
Our results show that previous studies of segmental collinearity between the A. thaliana, Brassica and ancestral karyotype genomes, although very useful, represent over-simplifications of their true relationships. The presence of numerous cryptic collinear genome segments and the frequent occurrence of segmental inversions mean that inference of the positions of genes in B. rapa based on the locations of orthologues in A. thaliana can be misleading. Our results will be of relevance to a wide range of plants that have polyploid genomes, many of which are being considered according to a paradigm of comprising conserved synteny blocks with respect to sequenced, related genomes.
As part of a research programme focused on flavonoid biosynthesis in the seed coat of Brassica napus L. (oilseed rape), orthologs of the BANYULS gene that encoded anthocyanidin reductase were cloned in B. napus as well as in the related species Brassica rapa and Brassica oleracea. B. napus genome contained four functional copies of BAN, two originating from each diploid progenitor. Amino acid sequences were highly conserved between the Brassicaceae including B. napus, B. rapa, B. oleracea as well as the model plant Arabidopsis thaliana. Along the 200 bp in 5′ of the ATG codon, Bna.BAN promoters (ProBna.BAN) were conserved with AtANR promoter and contained putative cis-acting elements. In addition, transgenic Arabidopsis and oilseed rape plants carrying the first 230 bp of ProBna.BAN fused to the UidA reporter gene were generated. In the two Brassicaceae backgrounds, ProBna.BAN activity was restricted to the seed coat. In B. napus seed, ProBna.BAN was activated in procyanidin-accumulating cells, namely the innermost layer of the inner integument and the micropyle-chalaza area. At the transcriptional level, the four Bna.BAN genes were expressed in the seed. Laser microdissection assays of the seed integuments showed that Bna.BAN expression was restricted to the inner integument, which was consistent with the activation profile of ProBna.BAN. Finally, Bna.BAN genes were mapped onto oilseed rape genetic maps and potential co-localisations with seed colour quantitative trait loci are discussed.
Anthocyanidin reductase; BANYULS genes; Brassica; Flavonoid metabolism; Seed coat-specific promoter
The genus Brassica (Brassicaceae, Brassiceae) is closely related to the model plant Arabidopsis, and includes several important crop plants. Against the background of ongoing genome sequencing, and in line with efforts to standardize and simplify description of genetic entities, we propose a standard systematic gene nomenclature system for the Brassica genus. This is based upon concatenating abbreviated categories, where these are listed in descending order of significance from left to right (i.e. genus – species – genome – gene name – locus – allele). Indicative examples are provided, and the considerations and recommendations for use are discussed, including outlining the relationship with functionally well-characterized Arabidopsis orthologues. A Brassica Gene Registry has been established under the auspices of the Multinational Brassica Genome Project that will enable management of gene names within the research community, and includes provisional allocation of standard names to genes previously described in the literature or in sequence repositories. The proposed standardization of Brassica gene nomenclature has been distributed to editors of plant and genetics journals and curators of sequence repositories, so that it can be adopted universally.
Polyploidy is an important evolutionary mechanism in flowering plants that often induces immediate extensive changes in gene expression through genomic merging and doubling. Brassica napus L. is one of the most economically important polyploid oil crops and has been broadly studied as an example of polyploid crop. RNA-seq is a recently developed technique for transcriptome study, which could be in choice for profiling gene expression pattern in polyploids.
We examined the global gene expression patterns of the first four generations of resynthesized B. napus (F1–F4), its diploid progenitors B. rapa and B. oleracea, and natural B. napus using digital gene expression analysis. Almost 42 million clean tags were generated using Illumina technology to produce the expression data for 25959 genes, which account for 63% of the annotated B. rapa genome. More than 56% of the genes were transcribed from both strands, which indicate the importance of RNA-mediated gene regulation in polyploidization. Tag mapping of the B. rapa genome generated 19023, 18547, 24383, 20659, 18881, 20692, and 19955 annotated genes for the B. rapa, B. oleracea, F1–F4 of synthesized B. napus, and natural B. napus libraries, respectively. The unambiguous tag-mapped genes in the libraries were functionally categorized via gene ontological analysis. Thousands of differentially expressed genes (DEGs) were identified and revealed the substantial changes in F1–F4. Among the 20 most DEGs are DNA binding/transcription factor, cyclin-dependent protein kinase, epoxycarotenoid dioxygenase, and glycine-rich protein. The Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis of the DEGs suggested approximately 120 biological pathways.
The systematic deep sequencing analysis provided a comprehensive understanding of the transcriptome complexity of early generations of synthesized B. napus. This information broadens our understanding of the mechanisms of B. napus polyploidization and contributes to molecular and genetic research by enriching the Brassica database.
Chinese cabbage (Brassica rapa ssp. pekinensis) is a member of one of the most important leaf vegetables grown worldwide, which has experienced thousands of years in cultivation and artificial selection. The entire Chinese cabbage genome sequence, and more than forty thousand proteins have been obtained to date. The genome has undergone triplication events since its divergence from Arabidopsis thaliana (13 to 17 Mya), however a high degree of sequence similarity and conserved genome structure remain between the two species. Arabidopsis is therefore a viable reference species for comparative genomics studies. Variation in the number of members in gene families due to genome triplication may contribute to the broad range of phenotypic plasticity, and increased tolerance to environmental extremes observed in Brassica species. Transcription factors are important regulators involved in plant developmental and physiological processes. The AP2/ERF proteins, one of the most important families of transcriptional regulators, play a crucial role in plant growth, and in response to biotic and abiotic stressors. Our analysis will provide resources for understanding the tolerance mechanisms in Brassica rapa ssp. pekinensis.
In the present study, 291 putative AP2/ERF transcription factor proteins were identified from the Chinese cabbage genome database, and compared with proteins from 15 additional species. The Chinese cabbage AP2/ERF superfamily was classified into four families, including AP2, ERF, RAV, and Soloist. The ERF family was further divided into DREB and ERF subfamilies. The AP2/ERF superfamily was subsequently divided into 15 groups. The identification, classification, phylogenetic reconstruction, conserved motifs, chromosome distribution, functional annotation, expression patterns, and interaction networks of the AP2/ERF transcription factor superfamily were predicted and analyzed. Distribution mapping results showed AP2/ERF superfamily genes were localized on the 10 Chinese cabbage chromosomes. AP2/ERF transcription factor expression levels exhibited differences among six tissue types based on expressed sequence tags (ESTs). In the AP2/ERF superfamily, 214 orthologous genes were identified between Chinese cabbage and Arabidopsis. Orthologous gene interaction networks were constructed, and included seven CBF and four AP2 genes, primarily involved in cold regulatory pathways and ovule development, respectively.
The evolution of the AP2/ERF transcription factor superfamily in Chinese cabbage resulted from genome triplication and tandem duplications. A comprehensive analysis of the physiological functions and biological roles of AP2/ERF superfamily genes in Chinese cabbage is required to fully elucidate AP2/ERF, which provides us with rich resources and opportunities to understand crop stress tolerance mechanisms.
Chinese cabbage; AP2/ERF; Stress tolerance; Gene expression; Interaction network; Protein annotation
The complex genome of rapeseed (Brassica napus) is not well understood despite the economic importance of the species. Good knowledge of sequence variation is needed for genetics approaches and breeding purposes. We used a diversity set of B. napus representing eight different germplasm types to sequence genome-wide distributed restriction-site associated DNA (RAD) fragments for polymorphism detection and genotyping.
More than 113,000 RAD clusters with more than 20,000 single nucleotide polymorphisms (SNPs) and 125 insertions/deletions were detected and characterized. About one third of the RAD clusters and polymorphisms mapped to the Brassica rapa reference sequence. An even distribution of RAD clusters and polymorphisms was observed across the B. rapa chromosomes, which suggests that there might be an equal distribution over the Brassica oleracea chromosomes, too. The representation of Gene Ontology (GO) terms for unigenes with RAD clusters and polymorphisms revealed no signature of selection with respect to the distribution of polymorphisms within genes belonging to a specific GO category.
Considering the decreasing costs for next-generation sequencing, the results of our study suggest that RAD sequencing is not only a simple and cost-effective method for high-density polymorphism detection but also an alternative to SNP genotyping from transcriptome sequencing or SNP arrays, even for species with complex genomes such as B. napus.
Brassica napus; Restriction-site associated DNA; Next-generation sequencing; Single nucleotide polymorphism; Genotyping by sequencing; Genetic diversity
The species Brassica rapa includes various vegetable crops. Production of these vegetable crops is usually impaired by heat stress. Some microRNAs (miRNAs) in Arabidopsis have been considered to mediate gene silencing in plant response to abiotic stress. However, it remains unknown whether or what miRNAs play a role in heat resistance of B. rapa. To identify genomewide conserved and novel miRNAs that are responsive to heat stress in B. rapa, we defined temperature thresholds of non-heading Chinese cabbage (B. rapa ssp. chinensis) and constructed small RNA libraries from the seedlings that had been exposed to high temperature (46 °C) for 1 h. By deep sequencing and data analysis, we selected a series of conserved and novel miRNAs that responded to heat stress. In total, Chinese cabbage shares at least 35 conserved miRNA families with Arabidopsis thaliana. Among them, five miRNA families were responsive to heat stress. Northern hybridization and real-time PCR showed that the conserved miRNAs bra-miR398a and bra-miR398b were heat-inhibitive and guided heat response of their target gene, BracCSD1; and bra-miR156h and bra-miR156g were heat-induced and its putative target BracSPL2 was down-regulated. According to the criteria of miRNA and miRNA* that form a duplex, 21 novel miRNAs belonging to 19 miRNA families were predicted. Of these, four were identified to be heat-responsive by Northern blotting and/or expression analysis of the putative targets. The two novel miRNAs bra-miR1885b.3 and bra-miR5718 negatively regulated their putative target genes. 5′-Rapid amplification of cDNA ends PCR indicated that three novel miRNAs cleaved the transcripts of their target genes where their precursors may have evolved from. These results broaden our perspective on the important role of miRNA in plant responses to heat.
Brassica rapa; heat response; miRNA; small RNA
Genome evolution is a continuous process and genomic rearrangement occurs both within and between species. With the sequencing of the Arabidopsis thaliana genome, comparative genetics and genomics offer new insights into plant biology. The genus Brassica offers excellent opportunities with which to compare genomic synteny so as to reveal genome evolution. During a previous genetic analysis of clubroot resistance in Brassica rapa, we identified a genetic region that is highly collinear with Arabidopsis chromosome 4. This region corresponds to a disease resistance gene cluster in the A. thaliana genome. Relying on synteny with Arabidopsis, we fine-mapped the region and found that the location and order of the markers showed good correspondence with those in Arabidopsis. Microsynteny on a physical map indicated an almost parallel correspondence, with a few rearrangements such as inversions and insertions. The results show that this genomic region of Brassica is conserved extensively with that of Arabidopsis and has potential as a disease resistance gene cluster, although the genera diverged 20 million years ago.
microsynteny; genome evolution; genome organization; genomic collinearity; BAC library
Following successful completion of the Brassica rapa sequencing project, the next step is to investigate functions of individual genes/proteins. For Arabidopsis thaliana, large amounts of protein–protein interaction (PPI) data are available from the major PPI databases (DBs). It is known that Brassica crop species are closely related to A. thaliana. This provides an opportunity to infer the B. rapa interactome using PPI data available from A. thaliana. In this paper, we present an inferred B. rapa interactome that is based on the A. thaliana PPI data from two resources: (i) A. thaliana PPI data from three major DBs, BioGRID, IntAct, and TAIR. (ii) ortholog-based A. thaliana PPI predictions. Linking between B. rapa and A. thaliana was accomplished in three complementary ways: (i) ortholog predictions, (ii) identification of gene duplication based on synteny and collinearity, and (iii) BLAST sequence similarity search. A complementary approach was also applied, which used known/predicted domain–domain interaction data. Specifically, since the two species are closely related, we used PPI data from A. thaliana to predict interacting domains that might be conserved between the two species. The predicted interactome was investigated for the component that contains known A. thaliana meiotic proteins to demonstrate its usability.
Brassica rapa; Arabidopsis thaliana; interactome; protein–protein interaction; domain–domain interaction; meiosis
Brassica oleracea is a morphologically diverse species in the family Brassicaceae and contains a group of nutrition-rich vegetable crops, including common heading cabbage, cauliflower, broccoli, kohlrabi, kale, Brussels sprouts. This diversity along with its phylogenetic membership in a group of three diploid and three tetraploid species, and the recent availability of genome sequences within Brassica provide an unprecedented opportunity to study intra- and inter-species divergence and evolution in this species and its close relatives.
We have developed a comprehensive database, Bolbase, which provides access to the B. oleracea genome data and comparative genomics information. The whole genome of B. oleracea is available, including nine fully assembled chromosomes and 1,848 scaffolds, with 45,758 predicted genes, 13,382 transposable elements, and 3,581 non-coding RNAs. Comparative genomics information is available, including syntenic regions among B. oleracea, Brassica rapa and Arabidopsis thaliana, synonymous (Ks) and non-synonymous (Ka) substitution rates between orthologous gene pairs, gene families or clusters, and differences in quantity, category, and distribution of transposable elements on chromosomes. Bolbase provides useful search and data mining tools, including a keyword search, a local BLAST server, and a customized GBrowse tool, which can be used to extract annotations of genome components, identify similar sequences and visualize syntenic regions among species. Users can download all genomic data and explore comparative genomics in a highly visual setting.
Bolbase is the first resource platform for the B. oleracea genome and for genomic comparisons with its relatives, and thus it will help the research community to better study the function and evolution of Brassica genomes as well as enhance molecular breeding research. This database will be updated regularly with new features, improvements to genome annotation, and new genomic sequences as they become available. Bolbase is freely available at http://ocri-genomics.org/bolbase.
Brassica oleracea; Database; Genome sequence; Synteny; Comparative genomics
The Multinational Brassica rapa Genome Sequencing Project (BrGSP) has developed valuable genomic resources, including BAC libraries, BAC-end sequences, genetic and physical maps, and seed BAC sequences for Brassica rapa. An integrated linkage map between the amphidiploid B. napus and diploid B. rapa will facilitate the rapid transfer of these valuable resources from B. rapa to B. napus (Oilseed rape, Canola).
In this study, we identified over 23,000 simple sequence repeats (SSRs) from 536 sequenced BACs. 890 SSR markers (designated as BrGMS) were developed and used for the construction of an integrated linkage map for the A genome in B. rapa and B. napus. Two hundred and nineteen BrGMS markers were integrated to an existing B. napus linkage map (BnaNZDH). Among these mapped BrGMS markers, 168 were only distributed on the A genome linkage groups (LGs), 18 distrubuted both on the A and C genome LGs, and 33 only distributed on the C genome LGs. Most of the A genome LGs in B. napus were collinear with the homoeologous LGs in B. rapa, although minor inversions or rearrangements occurred on A2 and A9. The mapping of these BAC-specific SSR markers enabled assignment of 161 sequenced B. rapa BACs, as well as the associated BAC contigs to the A genome LGs of B. napus.
The genetic mapping of SSR markers derived from sequenced BACs in B. rapa enabled direct links to be established between the B. napus linkage map and a B. rapa physical map, and thus the assignment of B. rapa BACs and the associated BAC contigs to the B. napus linkage map. This integrated genetic linkage map will facilitate exploitation of the B. rapa annotated genomic resources for gene tagging and map-based cloning in B. napus, and for comparative analysis of the A genome within Brassica species.