In view of the immense value of Brassica rapa in the fields of agriculture and molecular biology, the multinational Brassica rapa Genome Sequencing Project (BrGSP) was launched in 2003 by five countries. The BrGSP has developed valuable resources for the community, including a reference genetic map and seed BAC sequences. Although the initial B. rapa linkage map served as a reference for the BrGSP, there was ambiguity in reconciling the linkage groups with the ten chromosomes of B. rapa. Consequently, the BrGSP assigned each of the linkage groups to the project members as chromosome substitutes for sequencing.
We identified simple sequence repeat (SSR) motifs in the B. rapa genome with the sequences of seed BACs used for the BrGSP. By testing 749 amplicons containing SSR motifs, we identified polymorphisms that enabled the anchoring of 188 BACs onto the B. rapa reference linkage map consisting of 719 loci in the 10 linkage groups with an average distance of 1.6 cM between adjacent loci. The anchored BAC sequences enabled the identification of 30 blocks of conserved synteny, totaling 534.9 cM in length, between the genomes of B. rapa and Arabidopsis thaliana. Most of these were consistent with previously reported duplication and rearrangement events that differentiate these genomes. However, we were able to identify the collinear regions for seven additional previously uncharacterized sections of the A genome. Integration of the linkage map with the B. rapa cytogenetic map was accomplished by FISH with probes representing 20 BAC clones, along with probes for rDNA and centromeric repeat sequences. This integration enabled unambiguous alignment and orientation of the maps representing the 10 B. rapa chromosomes.
We developed a second generation reference linkage map for B. rapa, which was aligned unambiguously to the B. rapa cytogenetic map. Furthermore, using our data, we confirmed and extended the comparative genome analysis between B. rapa and A. thaliana. This work will serve as a basis for integrating the genetic, physical, and chromosome maps of the BrGSP, as well as for studies on polyploidization, speciation, and genome duplication in the genus Brassica.
The Multinational Brassica rapa Genome Sequencing Project (BrGSP) has developed valuable genomic resources, including BAC libraries, BAC-end sequences, genetic and physical maps, and seed BAC sequences for Brassica rapa. An integrated linkage map between the amphidiploid B. napus and diploid B. rapa will facilitate the rapid transfer of these valuable resources from B. rapa to B. napus (Oilseed rape, Canola).
In this study, we identified over 23,000 simple sequence repeats (SSRs) from 536 sequenced BACs. A total of 890 SSR markers (designated as BrGMS) were developed and used for the construction of an integrated linkage map for the A genome in B. rapa and B. napus. Two hundred and nineteen BrGMS markers were integrated into an existing B. napus linkage map (BnaNZDH). Among these mapped BrGMS markers, 168 were distributed only on the A genome linkage groups (LGs), 18 were distributed on both the A and C genome LGs, and 33 were distributed only on the C genome LGs. Most of the A genome LGs in B. napus were collinear with the homoeologous LGs in B. rapa, although minor inversions or rearrangements occurred on A2 and A9. The mapping of these BAC-specific SSR markers enabled assignment of 161 sequenced B. rapa BACs, as well as the associated BAC contigs, to the A genome LGs of B. napus.
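As an illustration of how SSR motifs can be located in BAC sequences, the following is a minimal sketch that scans a sequence for perfect tandem repeats with a regular expression. The function name `find_ssrs` and the thresholds (motif length 2 to 6 bp, at least 4 repeats) are assumptions chosen for the example; the source does not state which tool or parameters were used.

```python
import re

def find_ssrs(seq, min_unit=2, max_unit=6, min_repeats=4):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    Thresholds are illustrative assumptions, not the study's settings.
    """
    # Lazy motif match so the shortest repeating unit is preferred;
    # \1{n,} requires the motif to repeat at least (min_repeats - 1) more times.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        hits.append((m.start(), motif, len(m.group(0)) // len(motif)))
    return hits

# Toy sequence: six AG repeats flanked by unique bases
print(find_ssrs("GGC" + "AG" * 6 + "TTC"))  # [(3, 'AG', 6)]
```

Note that with these settings a mononucleotide run such as AAAAAAAA would be reported with motif "AA"; a production SSR finder would normally handle mononucleotide and compound repeats explicitly.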
The genetic mapping of SSR markers derived from sequenced BACs in B. rapa enabled direct links to be established between the B. napus linkage map and a B. rapa physical map, and thus the assignment of B. rapa BACs and the associated BAC contigs to the B. napus linkage map. This integrated genetic linkage map will facilitate exploitation of the B. rapa annotated genomic resources for gene tagging and map-based cloning in B. napus, and for comparative analysis of the A genome within Brassica species.
Euchromatic regions of the Brassica rapa genome were sequenced and mapped onto the corresponding regions in the Arabidopsis thaliana genome.
Brassica rapa is one of the most economically important vegetable crops worldwide. Owing to its agronomic importance and phylogenetic position, B. rapa provides a crucial reference to understand polyploidy-related crop genome evolution. The high degree of sequence identity and remarkably conserved genome structure between Arabidopsis and Brassica genomes enables comparative tiling sequencing using Arabidopsis sequences as references to select the counterpart regions in B. rapa, which is a strong challenge of structural and comparative crop genomics.
We assembled 65.8 megabase-pairs of non-redundant euchromatic sequence of B. rapa and compared this sequence to the Arabidopsis genome to investigate chromosomal relationships, macrosynteny blocks, and microsynteny within blocks. The triplicated B. rapa genome contains only approximately twice the number of genes as in Arabidopsis because of genome shrinkage. Genome comparisons suggest that B. rapa has a distinct organization of ancestral genome blocks as a result of recent whole genome triplication followed by a unique diploidization process. A lack of the most recent whole genome duplication (3R) event in the B. rapa genome, atypical of other Brassica genomes, may account for the emergence of B. rapa from the Brassica progenitor around 8 million years ago.
This work demonstrates the potential of using comparative tiling sequencing for genome analysis of crop species. Based on a comparative analysis of the B. rapa sequences and the Arabidopsis genome, it appears that polyploidy and chromosomal diploidization are ongoing processes that collectively stabilize the B. rapa genome and facilitate its evolution.
Brassica species include both vegetable and oilseed crops that are important in everyday human life. They also represent an excellent system for studying numerous aspects of plant biology, specifically genome evolution following polyploidy, which makes them important for scientific research. Now that the genome of Brassica rapa has been assembled, it is time for deep mining of the genome data.
BRAD, the Brassica database, is a web-based resource focusing on genome scale genetic and genomic data for important Brassica crops. BRAD was built based on the first whole genome sequence and on further data analysis of the Brassica A genome species, Brassica rapa (Chiifu-401-42). It provides datasets, such as the complete genome sequence of B. rapa, which was de novo assembled from Illumina GA II short reads and from BAC clone sequences, predicted genes and associated annotations, non-coding RNAs, transposable elements (TE), B. rapa genes orthologous to those in A. thaliana, as well as genetic markers and linkage maps. BRAD offers useful searching and data mining tools, including search across annotation datasets, search for syntenic or non-syntenic orthologs, and search of the flanking regions of a given target, as well as BLAST and GBrowse tools. BRAD allows users to enter almost any kind of information, such as a B. rapa or A. thaliana gene ID, physical position or genetic marker.
BRAD, a new database focusing on the genetics and genomics of Brassica plants, has been developed. It aims to help scientists and breeders use Brassica genome data fully and efficiently. BRAD will be continuously updated and can be accessed through http://brassicadb.org.
The genus Brassica includes the most extensively cultivated vegetable crops worldwide. Investigation of the Brassica genome presents excellent challenges to study plant genome evolution and divergence of gene function associated with polyploidy and genome hybridization. A physical map of the B. rapa genome is a fundamental tool for analysis of Brassica "A" genome structure. Integration of a physical map with an existing genetic map by linking genetic markers and BAC clones in the sequencing pipeline provides a crucial resource for the ongoing genome sequencing effort and assembly of whole genome sequences.
A genome-wide physical map of the B. rapa genome was constructed by the capillary electrophoresis-based fingerprinting of 67,468 Bacterial Artificial Chromosome (BAC) clones using the five restriction enzyme SNaPshot technique. The clones were assembled into contigs by means of FPC v8.5.3. After contig validation and manual editing, the resulting contig assembly consists of 1,428 contigs and is estimated to span 717 Mb in physical length. This map provides 242 anchored contigs on 10 linkage groups to serve as seed points from which to continue bidirectional chromosome extension for genome sequencing.
The map reported here is the first physical map for Brassica "A" genome based on the High Information Content Fingerprinting (HICF) technique. This physical map will serve as a fundamental genomic resource for accelerating genome sequencing, assembly of BAC sequences, and comparative genomics between Brassica genomes. The current build of the B. rapa physical map is available at the B. rapa Genome Project website for the user community.
Brassica rapa (AA) contains very diverse forms which include oleiferous types and many vegetable types. Genome sequence of B. rapa line Chiifu (ssp. pekinensis), a leafy vegetable type, was published in 2011. Using this knowledge, it is important to develop genomic resources for the oleiferous types of B. rapa. This will allow more involved molecular mapping, in-depth study of molecular mechanisms underlying important agronomic traits and introgression of traits from B. rapa to major oilseed crops - B. juncea (AABB) and B. napus (AACC). The study explores the availability of SNPs in RNA-seq generated contigs of three oleiferous lines of B. rapa - Candle (ssp. oleifera, turnip rape), YSPB-24 and Tetra (ssp. trilocularis, Yellow sarson) and their use in genome-wide linkage mapping and specific-region fine mapping using a RIL population between Chiifu and Tetra.
RNA-seq was carried out on the RNA isolated from young inflorescences containing unopened floral buds, floral axis and small leaves, using Illumina paired-end sequencing technology. Sequence assembly was carried out using the Velvet de-novo programme and the assembled contigs were organised against Chiifu gene models, available in the BRAD-CDS database. RNA-seq confirmed the presence of more than 17,000 single-copy gene models described in the BRAD database. The assembled contigs and the BRAD gene models were analyzed for the presence of SSRs and SNPs. While the number of SSRs was limited, more than 0.2 million SNPs were observed between Chiifu and the three oleiferous lines. Assays for SNPs were designed using KASPar technology and tested on an F7-RIL population derived from a Chiifu x Tetra cross. The design of the SNP assays was based on three considerations - the 50 bp flanking region of the SNPs should be strictly similar, the SNP should have a read-depth of ≥7 and no exon/intron junction should be present within the 101 bp target region. Using these criteria, a total of 640 markers (580 for genome-wide mapping and 60 for specific-region mapping) marking as many genes were tested for mapping. Out of 640 markers that were tested, 594 markers could be mapped unambiguously which included 542 markers for genome-wide mapping and 42 markers for fine mapping of the tet-o locus that is involved with the trait tetralocular ovary in the line Tetra.
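The three assay-design considerations can be expressed as a simple filter. The sketch below is illustrative only: the `CandidateSNP` record and its fields are hypothetical, and "strictly similar" flanks are interpreted here as "no other variant within 50 bp on either side", which is an assumption rather than the authors' stated definition.

```python
from dataclasses import dataclass

FLANK = 50      # bp on each side of the SNP, giving a 101 bp target region
MIN_DEPTH = 7   # minimum read depth stated in the text

@dataclass
class CandidateSNP:          # hypothetical record, for illustration only
    position: int            # SNP position on the gene model
    read_depth: int          # reads supporting the call
    other_variants: list     # positions of other SNPs/indels on the gene model
    junctions: list          # exon/intron junction positions on the gene model

def passes_design_criteria(snp):
    """Apply the three design considerations to one candidate SNP."""
    lo, hi = snp.position - FLANK, snp.position + FLANK

    def in_window(p):
        return lo <= p <= hi

    if snp.read_depth < MIN_DEPTH:
        return False
    # Assumed reading of "strictly similar": no second variant in the flanks
    if any(in_window(p) for p in snp.other_variants if p != snp.position):
        return False
    # No exon/intron junction inside the 101 bp target region
    if any(in_window(p) for p in snp.junctions):
        return False
    return True

good = CandidateSNP(position=500, read_depth=12, other_variants=[620], junctions=[300, 700])
shallow = CandidateSNP(position=500, read_depth=5, other_variants=[], junctions=[])
print(passes_design_criteria(good), passes_design_criteria(shallow))  # True False
```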
A large number of SNPs and PSVs are present in the transcriptome of B. rapa lines for genome-wide linkage mapping and specific-region fine mapping. Criteria used for SNP identification delivered markers, more than 93% of which could be successfully mapped to the F7-RIL population of the Chiifu x Tetra cross.
Brassica rapa; RNA-seq; Next generation sequencing; Single nucleotide polymorphism (SNP); Paralog specific variation (PSV); Coding DNA Sequences (CDS); KASPar assays
Recent advances, such as the availability of extensive genome survey sequence (GSS)
data and draft physical maps, are radically transforming the means by which we
can dissect Brassica genome structure and systematically relate it to the Arabidopsis
model. Hitherto, our view of the co-linearities between these closely related genomes
had been largely inferred from comparative RFLP data, necessitating substantial
interpolation and expert interpretation. Sequencing of the Brassica rapa genome
by the Multinational Brassica Genome Project will, however, enable an entirely
computational approach to this problem. Meanwhile we have been developing
databases and bioinformatics tools to support our work in Brassica comparative
genomics, including a recently completed draft physical map of B. rapa integrated
with anchor probes derived from the Arabidopsis genome sequence. We are also
exploring new ways to display the emerging Brassica–Arabidopsis sequence homology
data. We have mapped all publicly available Brassica sequences in silico to the
Arabidopsis TIGR v5 genome sequence and published this in the ATIDB database
that uses Generic Genome Browser (GBrowse). This in silico approach potentially
identifies all paralogous sequences and so we colour-code the significance of the
mappings and offer an integrated, real-time multiple alignment tool to partition them
into paralogous groups. The MySQL database driving GBrowse can also be directly
interrogated, using the powerful API offered by the Perl Bio::DB::GFF methods,
facilitating a wide range of data-mining possibilities.
Brassica rapa, which is closely related to
Arabidopsis thaliana, is an important crop and a
model plant for studying genome evolution via
polyploidization. We report the current understanding of the
genome structure of B. rapa and efforts for the
whole-genome sequencing of the species. The tribe
Brassiceae, which comprises ca. 240 species,
descended from a common hexaploid ancestor with a basic genome
similar to that of Arabidopsis. Chromosome
rearrangements, including fusions and/or fissions, resulted in
the present-day “diploid” Brassica
species with variation in chromosome number and phenotype.
Triplicated genomic segments of B. rapa are
collinear to those of A. thaliana with InDels.
The genome triplication has led to an approximately 1.7-fold
increase in the B. rapa gene number compared to
that of A. thaliana. Repetitive DNA of B.
rapa has also been extensively amplified and has
diverged from that of A. thaliana. For its
whole-genome sequencing, the Brassica rapa Genome
Sequencing Project (BrGSP) consortium has developed suitable
genomic resources and constructed genetic and physical maps.
Ten chromosomes of B. rapa are being allocated to
BrGSP consortium participants, and each chromosome will be
sequenced by a BAC-by-BAC approach. Genome sequencing of
B. rapa will offer a new perspective for plant
biology and evolution in the context of polyploidization.
The woodland strawberry, Fragaria vesca (2n = 2x = 14), is a versatile experimental plant system. This diminutive herbaceous perennial has a small genome (240 Mb), is amenable to genetic transformation and shares substantial sequence identity with the cultivated strawberry (Fragaria × ananassa) and other economically important rosaceous plants. Here we report the draft F. vesca genome, which was sequenced to ×39 coverage using second-generation technology, assembled de novo and then anchored to the genetic linkage map into seven pseudochromosomes. This diploid strawberry sequence lacks the large genome duplications seen in other rosids. Gene prediction modeling identified 34,809 genes, with most being supported by transcriptome mapping. Genes critical to valuable horticultural traits including flavor, nutritional value and flowering time were identified. Macrosyntenic relationships between Fragaria and Prunus predict a hypothetical ancestral Rosaceae genome that had nine chromosomes. New phylogenetic analysis of 154 protein-coding genes suggests that assignment of Populus to Malvidae, rather than Fabidae, is warranted.
Brassica juncea is an economically important vegetable crop in China, an oil crop in India, and a condiment crop in Europe, and has recently been selected for canola quality in Canada and Australia. B. juncea (2n = 36, AABB) is an allotetraploid derived from interspecific hybridization between B. rapa (2n = 20, AA) and B. nigra (2n = 16, BB), followed by spontaneous chromosome doubling.
Comparative genome analysis by genome survey sequence (GSS) of allopolyploid B. juncea with B. rapa was carried out based on high-throughput sequencing approaches. Over 28.35 Gb of GSS data were used for comparative analysis of B. juncea and B. rapa, producing 45.93% reads mapping to the B. rapa genome with a high ratio of single-end reads. Mapping data suggested more structure variation (SV) in the B. juncea genome than in B. rapa. We detected 2,921,310 single nucleotide polymorphisms (SNPs) with high heterozygosity and 113,368 SVs, including 1-3 bp Indels, between B. juncea and B. rapa. Non-synonymous polymorphisms in glucosinolate biosynthesis genes may account for differences in glucosinolate biosynthesis and glucosinolate components between B. juncea and B. rapa. Furthermore, we identified distinctive vernalization-dependent and photoperiod-dependent flowering pathways coexisting in allopolyploid B. juncea, suggesting contribution of these pathways to adaptation for survival during polyploidization.
Taken together, we proposed that polyploidization has allowed for accelerated evolution of the glucosinolate biosynthesis and flowering pathways in B. juncea that likely permit the phenotypic variation observed in the crop.
Brassica juncea; Comparative genome analysis; Flowering pathway; Genome survey sequencing; Glucosinolate biosynthesis
The complex genome of rapeseed (Brassica napus) is not well understood despite the economic importance of the species. Good knowledge of sequence variation is needed for genetics approaches and breeding purposes. We used a diversity set of B. napus representing eight different germplasm types to sequence genome-wide distributed restriction-site associated DNA (RAD) fragments for polymorphism detection and genotyping.
More than 113,000 RAD clusters with more than 20,000 single nucleotide polymorphisms (SNPs) and 125 insertions/deletions were detected and characterized. About one third of the RAD clusters and polymorphisms mapped to the Brassica rapa reference sequence. An even distribution of RAD clusters and polymorphisms was observed across the B. rapa chromosomes, which suggests that there might be an equal distribution over the Brassica oleracea chromosomes, too. The representation of Gene Ontology (GO) terms for unigenes with RAD clusters and polymorphisms revealed no signature of selection with respect to the distribution of polymorphisms within genes belonging to a specific GO category.
Considering the decreasing costs for next-generation sequencing, the results of our study suggest that RAD sequencing is not only a simple and cost-effective method for high-density polymorphism detection but also an alternative to SNP genotyping from transcriptome sequencing or SNP arrays, even for species with complex genomes such as B. napus.
Brassica napus; Restriction-site associated DNA; Next-generation sequencing; Single nucleotide polymorphism; Genotyping by sequencing; Genetic diversity
A complete genome sequence provides unlimited information in the sequenced organism
as well as in related taxa. According to the guidance of the Multinational Brassica
Genome Project (MBGP), the Korea Brassica Genome Project (KBGP) is sequencing
chromosome 1 (cytogenetically oriented chromosome #1) of Brassica rapa. We
have selected 48 seed BACs on chromosome 1 using EST genetic markers and FISH
analyses. Among them, 30 BAC clones have been sequenced and 18 are on the way.
Comparative genome analyses of the EST sequences and sequenced BAC clones from
Brassica chromosome 1 revealed their homeologous partner regions on the Arabidopsis
genome and a syntenic comparative map between Brassica chromosome 1 and
Arabidopsis chromosomes. In silico chromosome walking and clone validation have
been successfully applied to extending sequence contigs based on the comparative
map and BAC end sequences. In addition, we have defined the (peri)centromeric
heterochromatin blocks with centromeric tandem repeats, rDNA and centromeric
retrotransposons. In-depth sequence analyses of five homeologous BAC clones and
an Arabidopsis chromosomal region reveal overall co-linearity, with 82% sequence
similarity. The data indicate that the Brassica genome has undergone triplication and
subsequent gene losses after the divergence of Arabidopsis and Brassica. Based on in-depth
comparative genome analyses, we propose a comparative genomics approach
for conquering the Brassica genome. In 2005 we intend to construct an integrated
physical map, including sequence information from 500 BAC clones and integration
of fingerprinting data and end sequence data of more than 100 000 BAC clones.
The sequences have been submitted to GenBank with accession numbers: 10 204
BAC ends of the KBrH library (CW978640–CW988843); KBrH138P04, AC155338;
KBrH117N09, AC155337; KBrH097M21, AC155348; KBrH093K03, AC155347;
KBrH081N08, AC155346; KBrH080L24, AC155345; KBrH077A05, AC155343;
KBrH020D15, AC155340; KBrH015H17, AC155339; KBrH001H24, AC155335;
KBrH080A08, AC155344; KBrH004D11, AC155341; KBrH117M18, AC146875;
Sequencing of the chloroplast (cp) genome using traditional sequencing methods has been difficult because of its size (>120 kb) and the complicated procedures required to prepare templates. To explore the feasibility of sequencing the cp genome using DNA extracted from whole cells and Solexa sequencing technology, we sequenced whole cellular DNA isolated from leaves of three Brassica
rapa accessions with one lane per accession. In total, 246, 362, and 361 Mb sequence data were generated for the three accessions Chiifu-401-42, Z16, and FT, respectively. Micro-reads were assembled by reference-guided assembly using the cpDNA sequences of B. rapa, Arabidopsis thaliana, and Nicotiana tabacum. We achieved coverage of more than 99.96% of the cp genome in the three tested accessions using the B. rapa sequence as the reference. When A. thaliana or N. tabacum sequences were used as references, 99.7–99.8 or 95.5–99.7% of the B. rapa cp genome was covered, respectively. These results demonstrated that sequencing of whole cellular DNA isolated from young leaves using the Illumina Genome Analyzer is an efficient method for high-throughput sequencing of cp genome.
chloroplast genome; sequencing; Solexa sequencing technology; whole cellular DNA; Brassica rapa
The species Brassica rapa (2n=20, AA) is an important vegetable and oilseed crop, and serves as an excellent model for genomic and evolutionary research in Brassica species. With the availability of whole genome sequence of B. rapa, it is essential to further determine the activity of all functional elements of the B. rapa genome and explore the transcriptome on a genome-wide scale. Here, RNA-seq data was employed to provide a genome-wide transcriptional landscape and characterization of the annotated and novel transcripts and alternative splicing events across tissues.
RNA-seq reads were generated using the Illumina platform from six different tissues (root, stem, leaf, flower, silique and callus) of the B. rapa accession Chiifu-401-42, the same line used for whole genome sequencing. First, these data detected the widespread transcription of the B. rapa genome, leading to the identification of numerous novel transcripts and definition of 5'/3' UTRs of known genes. Second, 78.8% of the total annotated genes were detected as expressed and 45.8% were constitutively expressed across all tissues. We further defined several groups of genes: housekeeping genes, tissue-specific expressed genes and co-expressed genes across tissues, which will serve as a valuable repository for future crop functional genomics research. Third, alternative splicing (AS) is estimated to occur in more than 29.4% of intron-containing B. rapa genes, and 65% of them were commonly detected in more than two tissues. Interestingly, genes with a high rate of AS were over-represented in GO categories relating to transcriptional regulation and signal transduction, suggesting the potential importance of AS in regulating these genes. Further, we observed that intron retention (IR) is predominant among the AS events and seems to occur preferentially in genes with short introns.
The high-resolution RNA-seq analysis provides a global transcriptional landscape as a complement to the B. rapa genome sequence, which will advance our understanding of the dynamics and complexity of the B. rapa transcriptome. The atlas of gene expression in different tissues will be useful for accelerating research on functional genomics and genome evolution in Brassica species.
Brassica rapa; RNA-seq; Alternative splicing; Transcriptome
Brassica species (tribe Brassiceae) belonging to U's triangle—B. rapa (AA), B. nigra (BB), B. oleracea (CC), B. juncea (AABB), B. napus (AACC) and B. carinata (BBCC)—originated via two polyploidization rounds: a U event producing the three allopolyploids, and a more ancient b genome-triplication event giving rise to the A-, B-, and C-genome diploid species. Molecular mapping studies, in situ hybridization, and genome sequencing of B. rapa support the genome triplication origin of tribe Brassiceae, and suggest that these three diploid species diversified from a common hexaploid ancestor. Analysis of plastid DNA has revealed two distinct lineages—Rapa/Oleracea and Nigra—that conflict with hexaploidization as a single event defining the tribe Brassiceae. We analysed an R-block region of A. thaliana present in six copies in B. juncea (AABB), three copies each on A- and B-genomes to study gene fractionation pattern and synonymous base substitution rates (Ks values). Divergence time of paralogues within the A and B genomes and homoeologues between the A and B genomes was estimated. Homoeologous R blocks of the A and B genomes exhibited high gene collinearity and a conserved gene fractionation pattern. The three progenitors of diploid Brassicas were estimated to have diverged approximately 12 mya. Divergence of B. rapa and B. nigra, calculated from plastid gene sequences, was estimated to have occurred approximately 12 mya, coinciding with the divergence of the three genomes participating in the b event. Divergence of B. juncea A and B genome homoeologues was estimated to have taken place around 7 mya. Based on divergence time estimates and the presence of distinct plastid lineages in tribe Brassiceae, it is concluded that at least two independent triplication events involving reciprocal crosses at the time of the b event have given rise to Rapa/Oleracea and Nigra lineages.
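Divergence times of the kind reported above are typically derived from Ks values with a molecular clock. The sketch below assumes the standard relation T = Ks / (2r) and a rate r of 1.5e-8 synonymous substitutions per site per year, a value often used for Brassicaceae; neither the rate nor the example Ks value is taken from the source, and the function name is hypothetical.

```python
def divergence_mya(ks, rate=1.5e-8):
    """Estimate divergence time in million years from a Ks value.

    Assumes T = Ks / (2 * rate); the default rate is an assumed
    molecular-clock value, not one stated in the text.
    """
    if ks < 0 or rate <= 0:
        raise ValueError("ks must be non-negative and rate must be positive")
    years = ks / (2.0 * rate)
    return years / 1e6

# Hypothetical Ks value chosen for illustration
print(round(divergence_mya(0.36), 2))  # 12.0
```

Under this assumed rate, a Ks of roughly 0.36 corresponds to about 12 mya and a Ks of roughly 0.21 to about 7 mya; different clock calibrations would shift these values proportionally.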
Brassica rapa is an important crop species that produces vegetables, oilseed, and fodder. Although many studies reported quantitative trait loci (QTL) mapping, the genes governing most of its economically important traits are still unknown. In this study, we report QTL mapping for morphological and yield component traits in B. rapa and comparative map alignment between B. rapa, B. napus, B. juncea, and Arabidopsis thaliana to identify candidate genes and conserved QTL blocks between them. A total of 95 QTL were identified in different crucifer blocks of the B. rapa genome. Through synteny analysis with A. thaliana, B. rapa candidate genes and intronic and exonic single nucleotide polymorphisms in the parental lines were detected from whole genome resequenced data, a few of which were validated by mapping them to the QTL regions. Semi-quantitative reverse transcriptase PCR analysis showed differences in the expression levels of a few genes in parental lines. Comparative mapping identified five key major evolutionarily conserved crucifer blocks (R, J, F, E, and W) harbouring QTL for morphological and yield components traits between the A, B, and C subgenomes of B. rapa, B. juncea, and B. napus. The information of the identified candidate genes could be used for breeding B. rapa and other related Brassica species.
Brassica rapa; quantitative trait loci (QTL); morphological traits; single nucleotide polymorphism (SNP); conserved genome blocks
Brassica oleracea encompasses a family of vegetables, including cabbage, that are among the most widely cultivated crops. In 2009, the B. oleracea Genome Sequencing Project was launched using next generation sequencing technology. None of the available maps were detailed enough to anchor the sequence scaffolds for the Genome Sequencing Project. This report describes the development of a large number of SSR and SNP markers from the whole genome shotgun sequence data of B. oleracea, and the construction of a high-density genetic linkage map using a double haploid mapping population.
The B. oleracea high-density genetic linkage map that was constructed includes 1,227 markers in nine linkage groups spanning a total of 1197.9 cM with an average of 0.98 cM between adjacent loci. There were 602 SSR markers and 625 SNP markers on the map. The chromosome with the highest number of markers (186) was C03, and the chromosome with smallest number of markers (99) was C09.
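The reported marker density can be checked with simple arithmetic. The sketch below uses only the figures given in the text; the helper name `mean_spacing_cm` is hypothetical, and the abstract does not say whether the average was taken per marker or per interval, so both conventions are shown.

```python
def mean_spacing_cm(total_cm, n_markers, n_groups=0):
    """Average distance between adjacent loci.

    With n_groups=0 the map length is divided by the marker count;
    with n_groups set, it is divided by the number of intervals
    (markers minus one per linkage group).
    """
    return total_cm / (n_markers - n_groups)

ssr, snp = 602, 625
total_markers = ssr + snp                      # 1227, matching the text
per_marker = mean_spacing_cm(1197.9, total_markers)
per_interval = mean_spacing_cm(1197.9, total_markers, n_groups=9)
print(total_markers, round(per_marker, 2), round(per_interval, 2))  # 1227 0.98 0.98
```

Both conventions round to the 0.98 cM stated in the text.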
This first high-density map allowed the assembled scaffolds to be anchored to pseudochromosomes. The map also provides useful information for positional cloning, molecular breeding, and integration of information of genes and traits in B. oleracea. All the markers on the map will be transferable and could be used for the construction of other genetic maps.
Cabbage; Brassica; Genetic linkage map; SSR; SNP; Genome
Plasmodiophora brassicae, the causal agent of clubroot disease of the Brassica crops, is widespread in the world. Quantitative trait loci (QTLs) for partial resistance to 4 different isolates of P. brassicae (Pb2, Pb4, Pb7, and Pb10) were investigated using a BC1F1 population from a cross between two subspecies of Brassica rapa, i.e. Chinese cabbage inbred line C59-1 as a susceptible recurrent parent and turnip inbred line ECD04 as a resistant donor parent. The BC1F2 families were assessed for resistance under controlled conditions. A linkage map constructed with simple sequence repeats (SSR), unigene-derived microsatellite (UGMS) markers, and specific markers linked to published clubroot resistance (CR) genes of B. rapa was used to perform QTL mapping. A total of 6 QTLs residing in 5 CR QTL regions of the B. rapa chromosomes A01, A03, and A08 were identified to account for 12.2 to 35.2% of the phenotypic variance. Two QTL regions were found to be novel, whereas 3 QTLs lay in the respective regions of the previously identified Crr1, Crr2, and Crr3. QTL mapping results indicated that 1 QTL region was common for partial resistance to the 2 isolates Pb2 and Pb7, whereas the others were specific for each isolate. Additionally, synteny analysis between B. rapa and Arabidopsis thaliana revealed that each CR QTL region was aligned to a single conserved crucifer block (U, F, or R) on 3 Arabidopsis chromosomes where 2 CR QTLs were detected in A. thaliana. These results suggest that some common ancestral genomic regions were involved in the evolution of CR genes in B. rapa.
Completion of the sequencing of the Brassica rapa genome enabled us to undertake a genome-wide identification and functional study of the gene families related to the morphological diversity and agronomic traits of Brassica crops. In this study, we identified the auxin response factor (ARF) gene family, which is one of the key regulators of auxin-mediated plant growth and development in the B. rapa genome. A total of 31 ARF genes were identified in the genome. Phylogenetic and evolutionary analyses suggest that ARF genes fell into four major classes and were amplified in the B. rapa genome as a result of a recent whole genome triplication after speciation from Arabidopsis thaliana. Despite its recent hexaploid ancestry, B. rapa includes a relatively small number of ARF genes compared with the 23 members in A. thaliana, presumably due to a paralog reduction related to repetitive sequence insertion into promoter and non-coding transcribed region of the genes. Comparative genomic and mRNA sequencing analyses demonstrated that 27 of the 31 BrARF genes were transcriptionally active, and their expression was affected by either auxin treatment or floral development stage, although 4 genes were inactive, suggesting that the generation and pseudogenization of ARF members are likely to be an ongoing process. This study will provide a fundamental basis for the modification and evolution of the gene family after a polyploidy event, as well as a functional study of ARF genes in a polyploidy crop species.
Electronic supplementary material
The online version of this article (doi:10.1007/s00438-012-0718-4) contains supplementary material, which is available to authorized users.
Brassica rapa; Auxin response factor; Genome organization; mRNA sequencing; Evolution
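The "relatively small number" of ARF genes can be made concrete with a short calculation. The sketch below is only illustrative: it assumes, as a simplification not stated in the abstract, that the pre-triplication ancestor carried the same 23 ARF genes as A. thaliana, so that full retention after a whole genome triplication would give 3 x 23 copies. The function name `retention_after_triplication` is hypothetical.

```python
def retention_after_triplication(observed, ancestral_count, copies=3):
    """Return (expected gene count under full retention, retained fraction).

    Assumption (not from the abstract): the ancestor had `ancestral_count`
    genes and every gene was copied `copies` times by the triplication.
    """
    if ancestral_count <= 0 or copies <= 0:
        raise ValueError("ancestral_count and copies must be positive")
    expected = ancestral_count * copies
    return expected, observed / expected


# Figures taken from the abstract: 31 BrARF genes, 23 ARF genes in A. thaliana.
expected, retained = retention_after_triplication(observed=31, ancestral_count=23)
print(f"expected under full retention: {expected}")
print(f"retained fraction: {retained:.2f}")

# Share of BrARF genes reported as transcriptionally active (27 of 31).
active_fraction = 27 / 31
print(f"transcriptionally active: {active_fraction:.2f}")
```

Under this simplified model, fewer than half of the copies expected from full retention are present, which is consistent with the paralog reduction the abstract describes.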
Anthocyanins are flavonoid pigments that are responsible for purple coloration in the stems and leaves of a variety of plant species. Anthocyaninless (anl) mutants of Brassica rapa fail to produce anthocyanin pigments. In rapid-cycling Brassica rapa, also known as Wisconsin Fast Plants, the anthocyaninless trait, also called non-purple stem, is widely used as a model recessive trait for teaching genetics. Although anthocyanin genes have been mapped in other plants such as Arabidopsis thaliana, the anl locus has not been mapped in any Brassica species.
We tested primer pairs known to amplify microsatellites in Brassicas and identified 37 that amplified a product in rapid-cycling Brassica rapa. We then developed three-generation pedigrees to assess linkage between the microsatellite markers and anl. Twenty-two of the markers that we tested were polymorphic in our crosses. Based on 177 F2 offspring, we identified three markers linked to anl with LOD scores ≥ 5.0, forming a linkage group spanning 46.9 cM. Because one of these markers has been assigned to a known B. rapa linkage group, we can now assign the anl locus to B. rapa linkage group R9.
This study is the first to identify the chromosomal location of an anthocyanin pigment gene among the Brassicas. It also connects a classical mutant frequently used in genetics education with molecular markers and a known chromosomal location.
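For readers unfamiliar with LOD scores and centimorgan distances, the following minimal sketch shows how the two quantities are commonly computed. It is not the authors' procedure: it assumes a simplified, phase-known setting in which recombinant and non-recombinant gametes can be counted directly (an F2 with a recessive trait generally requires a maximum-likelihood fit over genotype classes), and it uses the Kosambi mapping function, which the abstract does not specify. The counts in the example are hypothetical.

```python
import math


def lod_score(recombinants, total, theta):
    """LOD for linkage at recombination fraction theta versus theta = 0.5."""
    if not 0.0 < theta < 0.5:
        raise ValueError("theta must lie strictly between 0 and 0.5")
    if not 0 <= recombinants <= total:
        raise ValueError("recombinants must be between 0 and total")
    non_recombinants = total - recombinants
    log_l_linked = (recombinants * math.log10(theta)
                    + non_recombinants * math.log10(1.0 - theta))
    log_l_unlinked = total * math.log10(0.5)
    return log_l_linked - log_l_unlinked


def max_lod(recombinants, total):
    """Evaluate the LOD at the maximum-likelihood estimate r / n."""
    theta_hat = recombinants / total
    # Keep the estimate inside the open interval accepted by lod_score.
    theta_hat = min(max(theta_hat, 1e-6), 0.5 - 1e-6)
    return theta_hat, lod_score(recombinants, total, theta_hat)


def kosambi_cm(theta):
    """Convert a recombination fraction to map distance in cM (Kosambi)."""
    if not 0.0 <= theta < 0.5:
        raise ValueError("theta must be in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * theta) / (1.0 - 2.0 * theta))


# Hypothetical counts: 10 recombinant gametes among 100 informative gametes.
theta_hat, lod = max_lod(10, 100)
print(f"theta = {theta_hat:.2f}, LOD = {lod:.2f}, distance = {kosambi_cm(theta_hat):.1f} cM")
```

A LOD of 5.0 means the data are 10^5 times more likely under linkage at the estimated recombination fraction than under independent assortment.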
MicroRNAs (miRNAs) are recently discovered, noncoding, small regulatory RNA molecules that negatively regulate gene expression. Although many miRNAs have been identified and validated in many plant species, they remain largely unknown in Brassica rapa (AA, 2n = 20). B. rapa is an important Brassica crop with wide genetic and morphological diversity, resulting in several subspecies that are largely grown for vegetable, oilseed, and fodder crop production. In this study, we identified 186 miRNAs belonging to 55 families in B. rapa by using comparative genomics. The lengths of the identified mature miRNAs and pre-miRNAs ranged from 18 to 22 and from 66 to 305 nucleotides, respectively. Comparison of the 4 nucleotides revealed that uracil is the predominant base in the first position of B. rapa miRNAs, suggesting that it plays an important role in miRNA-mediated gene regulation. Overall, adenine and guanine were predominant in mature miRNAs, while adenine and uracil were predominant in pre-miRNA sequences. One DNA sequence producing both sense and antisense mature miRNAs belonging to the BrMiR 399 family, which differ by 1 nucleotide at the 20th position, was identified. In silico analyses, using previously established methods, predicted 66 miRNA target mRNAs for 33 miRNA families. The majority of the target genes were transcription factors that regulate plant growth and development, followed by a few target genes that are involved in fatty acid metabolism, glycolysis, biotic and abiotic stresses, and other cellular processes. Northern blot and qRT-PCR analyses of RNA samples prepared from different B. rapa tissues for 17 miRNA families revealed that miRNAs are differentially expressed both quantitatively and qualitatively in different tissues of B. rapa.
Brassicaceae; in silico; Small RNAs
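The nucleotide comparison described above (first-position base and overall base composition) can be sketched as follows. This is an illustrative outline, not the authors' pipeline; the function names are hypothetical and the sequences are toy strings invented for the example, not B. rapa miRNAs.

```python
from collections import Counter


def normalise(seq):
    """Upper-case a sequence and write DNA-style T as RNA-style U."""
    return seq.upper().replace("T", "U")


def first_base_counts(seqs):
    """Count which base occupies position 1 of each non-empty sequence."""
    return Counter(normalise(seq)[0] for seq in seqs if seq)


def base_composition(seqs):
    """Return the overall fraction of A, C, G and U across all sequences."""
    counts = Counter()
    for seq in seqs:
        counts.update(normalise(seq))
    total = sum(counts[base] for base in "ACGU")
    if total == 0:
        raise ValueError("no A/C/G/U bases found")
    return {base: counts[base] / total for base in "ACGU"}


# Toy sequences (hypothetical, for illustration only).
toy_mirnas = ["UGCCAAAGGAG", "UAGCC", "AUGGC", "tCGGA"]

print(first_base_counts(toy_mirnas).most_common())
print(base_composition(toy_mirnas))
```

Applied to a real set of mature miRNA and pre-miRNA sequences, the same two functions would give the kind of position-1 and whole-sequence summaries reported in the abstract.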
Map-based cloning of quantitative trait loci (QTLs) in polyploid crop species remains a challenge due to the complexity of their genome structures. QTLs for seed weight in B. napus have been identified, but information on candidate genes for the identified QTLs of this important trait is still scarce.
In this study, a whole genome genetic linkage map for B. napus was constructed using simple sequence repeat (SSR) markers that covered a genetic distance of 2,126.4 cM with an average distance of 5.36 cM between markers. A procedure was developed to establish colinearity of SSR loci on B. napus with its two progenitor diploid species B. rapa and B. oleracea through extensive bioinformatics analysis. With the aid of B. rapa and B. oleracea genome sequences, the 421 homologous colinear loci deduced from the SSR loci of B. napus were shown to correspond to 398 homologous loci in Arabidopsis thaliana. Through comparative mapping of Arabidopsis and the three Brassica species, 227 homologous genes for seed size/weight were mapped on the B. napus genetic map, establishing the genetic bases for the important agronomic trait in this amphidiploid species. Furthermore, 12 candidate genes underlying 8 QTLs for seed weight were identified, and a gene-specific marker for BnAP2 was developed through molecular cloning using the seed weight/size gene distribution map in B. napus.
Our study showed that it is feasible to identify candidate genes of QTLs using an SSR-based B. napus genetic map through comparative mapping among Arabidopsis, B. napus, and its two progenitor species B. rapa and B. oleracea. Identification of candidate genes for seed weight in amphidiploid B. napus will accelerate the process of isolating the mapped QTLs for this important trait, and this approach may be useful for QTL identification of other traits of agronomic significance.
Brassicaceae; Rapeseed; Arabidopsis; Comparative mapping; QTL; Map-based cloning; Seed weight
The Brassica species, related to Arabidopsis thaliana, include an important group of crops and represent an excellent system for studying the evolutionary consequences of polyploidy. Previous studies have led to a proposed structure for an ancestral karyotype and models for the evolution of the B. rapa genome by triplication and segmental rearrangement, but these have not been validated at the sequence level.
We developed computational tools to analyse the public collection of B. rapa BAC end sequences in order to identify candidate clones representing collinearity discontinuities between the genomes of B. rapa and A. thaliana. For each putative discontinuity, one of the BACs was sequenced and analysed for collinearity with the genome of A. thaliana. Additional BAC clones were identified and sequenced as part of ongoing efforts to sequence four chromosomes of B. rapa. Strikingly few of the 19 inter-chromosomal rearrangements corresponded to the set of collinearity discontinuities anticipated on the basis of previous studies. Our analyses revealed numerous instances of newly detected collinearity blocks. For B. rapa linkage group A8, we were able to develop a model for the derivation of the chromosome from the ancestral karyotype. We were also able to identify a rearrangement event in the ancestor of B. rapa that was not shared with the ancestor of A. thaliana, and is represented in triplicate in the B. rapa genome. In addition to inter-chromosomal rearrangements, we identified and analysed 32 BACs containing the end points of segmental inversion events.
Our results show that previous studies of segmental collinearity between the A. thaliana, Brassica and ancestral karyotype genomes, although very useful, represent over-simplifications of their true relationships. The presence of numerous cryptic collinear genome segments and the frequent occurrence of segmental inversions mean that inference of the positions of genes in B. rapa based on the locations of orthologues in A. thaliana can be misleading. Our results will be of relevance to a wide range of plants that have polyploid genomes, many of which are being considered according to a paradigm of comprising conserved synteny blocks with respect to sequenced, related genomes.
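The abstract does not describe how the BAC end sequence screen was implemented. One plausible way to flag candidate discontinuities, shown purely as an assumption-laden sketch, is to compare where the two ends of each BAC find their best match in the A. thaliana genome. The `EndHit` type, the `classify_bac` function, the category labels, and the `max_span` threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EndHit:
    """Best A. thaliana match for one BAC end (hypothetical representation)."""
    chrom: str  # A. thaliana chromosome name, e.g. "Chr4"
    pos: int    # match position in base pairs


def classify_bac(left: Optional[EndHit], right: Optional[EndHit],
                 max_span: int = 500_000) -> str:
    """Classify a BAC by the relative placement of its two end matches.

    max_span is an arbitrary illustrative threshold, not a published value.
    """
    if left is None or right is None:
        return "unplaced"  # at least one end has no usable match
    if left.chrom != right.chrom:
        return "inter-chromosomal candidate"
    if abs(left.pos - right.pos) > max_span:
        return "intra-chromosomal candidate"
    return "collinear"


# Hypothetical end placements for three BACs.
examples = {
    "bac_a": (EndHit("Chr4", 1_000_000), EndHit("Chr4", 1_120_000)),
    "bac_b": (EndHit("Chr4", 1_000_000), EndHit("Chr2", 8_500_000)),
    "bac_c": (EndHit("Chr1", 200_000), None),
}
for name, (left, right) in examples.items():
    print(name, classify_bac(left, right))
```

BACs flagged as candidates by a screen of this kind would still need full sequencing to confirm the breakpoint, as was done in the study.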
For identification of genes responsible for varietal differences in flowering time and leaf morphological traits, we constructed a linkage map of Brassica rapa DNA markers including 170 EST-based markers, 12 SSR markers, and 59 BAC sequence-based markers, of which 151 are single nucleotide polymorphism (SNP) markers. By BLASTN, 223 markers were shown to have homologous regions in Arabidopsis thaliana, and these homologous loci covered nearly the whole genome of A. thaliana. Synteny analysis between B. rapa and A. thaliana revealed 33 large syntenic regions. Three quantitative trait loci (QTLs) for flowering time were detected. BrFLC1 and BrFLC2 were linked to the QTLs for bolting time, budding time, and flowering time. Three SNPs in the promoter, which may be the cause of low expression of BrFLC2 in the early-flowering parental line, were identified. For leaf lobe depth and leaf hairiness, one major QTL corresponding to a syntenic region containing GIBBERELLIN 20 OXIDASE 3 and one major QTL containing BrGL1, respectively, were detected. Analysis of nucleotide sequences and expression of these genes suggested possible involvement of these genes in leaf morphological traits.
DNA markers; synteny; bolting time; leaf lobe; leaf hairiness
Genome evolution is a continuous process and genomic rearrangement occurs both within and between species. With the sequencing of the Arabidopsis thaliana genome, comparative genetics and genomics offer new insights into plant biology. The genus Brassica offers excellent opportunities with which to compare genomic synteny so as to reveal genome evolution. During a previous genetic analysis of clubroot resistance in Brassica rapa, we identified a genetic region that is highly collinear with Arabidopsis chromosome 4. This region corresponds to a disease resistance gene cluster in the A. thaliana genome. Relying on synteny with Arabidopsis, we fine-mapped the region and found that the location and order of the markers showed good correspondence with those in Arabidopsis. Microsynteny on a physical map indicated an almost parallel correspondence, with a few rearrangements such as inversions and insertions. The results show that this genomic region of Brassica is conserved extensively with that of Arabidopsis and has potential as a disease resistance gene cluster, although the genera diverged 20 million years ago.
microsynteny; genome evolution; genome organization; genomic collinearity; BAC library