In view of the immense value of Brassica rapa in the fields of agriculture and molecular biology, the multinational Brassica rapa Genome Sequencing Project (BrGSP) was launched in 2003 by five countries. The BrGSP has since developed valuable resources for the community, including a reference genetic map and seed BAC sequences. Although the initial B. rapa linkage map served as a reference for the BrGSP, there was ambiguity in reconciling the linkage groups with the ten chromosomes of B. rapa. Consequently, the BrGSP assigned each of the linkage groups to the project members as chromosome substitutes for sequencing.
We identified simple sequence repeat (SSR) motifs in the B. rapa genome with the sequences of seed BACs used for the BrGSP. By testing 749 amplicons containing SSR motifs, we identified polymorphisms that enabled the anchoring of 188 BACs onto the B. rapa reference linkage map consisting of 719 loci in the 10 linkage groups with an average distance of 1.6 cM between adjacent loci. The anchored BAC sequences enabled the identification of 30 blocks of conserved synteny, totaling 534.9 cM in length, between the genomes of B. rapa and Arabidopsis thaliana. Most of these were consistent with previously reported duplication and rearrangement events that differentiate these genomes. However, we were able to identify the collinear regions for seven additional previously uncharacterized sections of the A genome. Integration of the linkage map with the B. rapa cytogenetic map was accomplished by FISH with probes representing 20 BAC clones, along with probes for rDNA and centromeric repeat sequences. This integration enabled unambiguous alignment and orientation of the maps representing the 10 B. rapa chromosomes.
We developed a second generation reference linkage map for B. rapa, which was aligned unambiguously to the B. rapa cytogenetic map. Furthermore, using our data, we confirmed and extended the comparative genome analysis between B. rapa and A. thaliana. This work will serve as a basis for integrating the genetic, physical, and chromosome maps of the BrGSP, as well as for studies on polyploidization, speciation, and genome duplication in the genus Brassica.
The Multinational Brassica rapa Genome Sequencing Project (BrGSP) has developed valuable genomic resources, including BAC libraries, BAC-end sequences, genetic and physical maps, and seed BAC sequences for Brassica rapa. An integrated linkage map between the amphidiploid B. napus and diploid B. rapa will facilitate the rapid transfer of these valuable resources from B. rapa to B. napus (Oilseed rape, Canola).
In this study, we identified over 23,000 simple sequence repeats (SSRs) from 536 sequenced BACs. A total of 890 SSR markers (designated as BrGMS) were developed and used for the construction of an integrated linkage map for the A genome in B. rapa and B. napus. Two hundred and nineteen BrGMS markers were integrated into an existing B. napus linkage map (BnaNZDH). Among these mapped BrGMS markers, 168 were distributed only on the A genome linkage groups (LGs), 18 were distributed on both the A and C genome LGs, and 33 were distributed only on the C genome LGs. Most of the A genome LGs in B. napus were collinear with the homoeologous LGs in B. rapa, although minor inversions or rearrangements occurred on A2 and A9. The mapping of these BAC-specific SSR markers enabled assignment of 161 sequenced B. rapa BACs, as well as the associated BAC contigs, to the A genome LGs of B. napus.
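As an illustration of how SSR motifs can be located in BAC sequences, the following is a minimal Python sketch that reports perfect repeats of 1 to 6 bp motifs. The minimum repeat counts and the function names are assumptions made for this example; the abstract does not state which tool or thresholds were used.

```python
import re

# Minimum number of repeats per motif length. These thresholds are
# illustrative assumptions, not values reported by the study.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )


def find_ssrs(sequence, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples; start is 0-based."""
    seq = sequence.upper()
    hits = []
    for k, min_n in sorted(min_repeats.items()):
        # One motif of length k followed by at least min_n - 1 exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_n - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


if __name__ == "__main__":
    demo = "TTGACC" + "AG" * 8 + "CCGTA" + "CTT" * 5 + "GG"
    for start, motif, count in find_ssrs(demo):
        print(start, motif, count)
```

Primers for marker development would then be designed in the unique sequence flanking each reported repeat; that step is not shown here.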
The genetic mapping of SSR markers derived from sequenced BACs in B. rapa enabled direct links to be established between the B. napus linkage map and a B. rapa physical map, and thus the assignment of B. rapa BACs and the associated BAC contigs to the B. napus linkage map. This integrated genetic linkage map will facilitate exploitation of the B. rapa annotated genomic resources for gene tagging and map-based cloning in B. napus, and for comparative analysis of the A genome within Brassica species.
Euchromatic regions of the Brassica rapa genome were sequenced and mapped onto the corresponding regions in the Arabidopsis thaliana genome.
Brassica rapa is one of the most economically important vegetable crops worldwide. Owing to its agronomic importance and phylogenetic position, B. rapa provides a crucial reference to understand polyploidy-related crop genome evolution. The high degree of sequence identity and remarkably conserved genome structure between Arabidopsis and Brassica genomes enable comparative tiling sequencing using Arabidopsis sequences as references to select the counterpart regions in B. rapa, which is a major challenge in structural and comparative crop genomics.
We assembled 65.8 megabase-pairs of non-redundant euchromatic sequence of B. rapa and compared this sequence to the Arabidopsis genome to investigate chromosomal relationships, macrosynteny blocks, and microsynteny within blocks. The triplicated B. rapa genome contains only approximately twice the number of genes as in Arabidopsis because of genome shrinkage. Genome comparisons suggest that B. rapa has a distinct organization of ancestral genome blocks as a result of recent whole genome triplication followed by a unique diploidization process. A lack of the most recent whole genome duplication (3R) event in the B. rapa genome, atypical of other Brassica genomes, may account for the emergence of B. rapa from the Brassica progenitor around 8 million years ago.
This work demonstrates the potential of using comparative tiling sequencing for genome analysis of crop species. Based on a comparative analysis of the B. rapa sequences and the Arabidopsis genome, it appears that polyploidy and chromosomal diploidization are ongoing processes that collectively stabilize the B. rapa genome and facilitate its evolution.
Brassica species include both vegetable and oilseed crops, which are very important in daily life. The Brassica species also represent an excellent system for studying numerous aspects of plant biology, specifically the analysis of genome evolution following polyploidy, and are therefore also very important for scientific research. Now that the genome of Brassica rapa has been assembled, it is time for in-depth mining of the genome data.
BRAD, the Brassica database, is a web-based resource focusing on genome-scale genetic and genomic data for important Brassica crops. BRAD was built based on the first whole genome sequence and on further data analysis of the Brassica A genome species, Brassica rapa (Chiifu-401-42). It provides datasets such as the complete genome sequence of B. rapa, which was de novo assembled from Illumina GA II short reads and from BAC clone sequences, predicted genes and associated annotations, non-coding RNAs, transposable elements (TEs), B. rapa genes orthologous to those in A. thaliana, and genetic markers and linkage maps. BRAD offers useful searching and data mining tools, including searches across annotation datasets, searches for syntenic or non-syntenic orthologs, and searches of the flanking regions of a given target, as well as BLAST and GBrowse. BRAD accepts many kinds of query input, such as a B. rapa or A. thaliana gene ID, a physical position, or a genetic marker.
BRAD, a new database focusing on the genetics and genomics of Brassica plants, has been developed. It aims to help scientists and breeders make full and efficient use of Brassica genome data. BRAD will be continuously updated and can be accessed through http://brassicadb.org.
Brassica rapa (AA) contains very diverse forms, which include oleiferous types and many vegetable types. The genome sequence of B. rapa line Chiifu (ssp. pekinensis), a leafy vegetable type, was published in 2011. Building on this knowledge, it is important to develop genomic resources for the oleiferous types of B. rapa. This will allow more detailed molecular mapping, in-depth study of molecular mechanisms underlying important agronomic traits, and introgression of traits from B. rapa into the major oilseed crops B. juncea (AABB) and B. napus (AACC). The study explores the availability of SNPs in RNA-seq generated contigs of three oleiferous lines of B. rapa - Candle (ssp. oleifera, turnip rape), YSPB-24 and Tetra (ssp. trilocularis, Yellow sarson) - and their use in genome-wide linkage mapping and specific-region fine mapping using a RIL population between Chiifu and Tetra.
RNA-seq was carried out on the RNA isolated from young inflorescences containing unopened floral buds, floral axis and small leaves, using Illumina paired-end sequencing technology. Sequence assembly was carried out using the Velvet de novo programme and the assembled contigs were organised against Chiifu gene models, available in the BRAD-CDS database. RNA-seq confirmed the presence of more than 17,000 single-copy gene models described in the BRAD database. The assembled contigs and the BRAD gene models were analyzed for the presence of SSRs and SNPs. While the number of SSRs was limited, more than 0.2 million SNPs were observed between Chiifu and the three oleiferous lines. Assays for SNPs were designed using KASPar technology and tested on an F7-RIL population derived from a Chiifu x Tetra cross. The design of the SNP assays was based on three considerations: the 50 bp flanking region of the SNPs should be strictly similar, the SNP should have a read-depth of ≥7, and no exon/intron junction should be present within the 101 bp target region. Using these criteria, a total of 640 markers (580 for genome-wide mapping and 60 for specific-region mapping) marking as many genes were tested for mapping. Out of 640 markers that were tested, 594 markers could be mapped unambiguously which included 542 markers for genome-wide mapping and 42 markers for fine mapping of the tet-o locus that is involved with the trait tetralocular ovary in the line Tetra.
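The three assay-design considerations can be expressed as a simple filter. The sketch below is one possible reading of them, not the authors' pipeline: "strictly similar" flanks are interpreted as flanks free of any other variant, and the record layout, the junction encoding, and all names are hypothetical.

```python
from dataclasses import dataclass

FLANK_BP = 50     # flank length on each side of the SNP, from the text
MIN_DEPTH = 7     # minimum read depth, from the text


@dataclass
class CandidateSnp:
    position: int    # 1-based SNP position on the contig (hypothetical layout)
    read_depth: int  # number of reads covering the SNP


def passes_design_criteria(snp, other_variants, junctions, contig_length):
    """Apply the three stated criteria to one candidate SNP.

    other_variants: 1-based positions of other SNPs/PSVs on the contig.
    junctions: 1-based position j of each exon's last base, so a junction
               lies between bases j and j + 1 (an assumed encoding).
    """
    start = snp.position - FLANK_BP   # first base of the 101 bp target
    end = snp.position + FLANK_BP     # last base of the 101 bp target
    if start < 1 or end > contig_length:
        return False                  # not enough flanking sequence
    if snp.read_depth < MIN_DEPTH:
        return False                  # read depth below the threshold
    if any(start <= p <= end and p != snp.position for p in other_variants):
        return False                  # flanks are not free of variation
    if any(start <= j < end for j in junctions):
        return False                  # exon/intron junction inside the target
    return True


if __name__ == "__main__":
    snp = CandidateSnp(position=100, read_depth=9)
    print(passes_design_criteria(snp, [30, 200], [160], contig_length=300))
```

The target window is 50 + 1 + 50 = 101 bp, which matches the 101 bp region named in the text.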
A large number of SNPs and PSVs are present in the transcriptome of B. rapa lines for genome-wide linkage mapping and specific-region fine mapping. Criteria used for SNP identification delivered markers, more than 93% of which could be successfully mapped to the F7–RIL population of Chiifu x Tetra cross.
Brassica rapa; RNA-seq; Next generation sequencing; Single nucleotide polymorphism (SNP); Paralog specific variation (PSV); Coding DNA Sequences (CDS); KASPar assays
The genus Brassica includes the most extensively cultivated vegetable crops worldwide. Investigation of the Brassica genome presents excellent challenges to study plant genome evolution and divergence of gene function associated with polyploidy and genome hybridization. A physical map of the B. rapa genome is a fundamental tool for analysis of Brassica "A" genome structure. Integration of a physical map with an existing genetic map by linking genetic markers and BAC clones in the sequencing pipeline provides a crucial resource for the ongoing genome sequencing effort and assembly of whole genome sequences.
A genome-wide physical map of the B. rapa genome was constructed by the capillary electrophoresis-based fingerprinting of 67,468 Bacterial Artificial Chromosome (BAC) clones using the five restriction enzyme SNaPshot technique. The clones were assembled into contigs by means of FPC v8.5.3. After contig validation and manual editing, the resulting contig assembly consists of 1,428 contigs and is estimated to span 717 Mb in physical length. This map provides 242 anchored contigs on 10 linkage groups to serve as seed points from which to continue bidirectional chromosome extension for genome sequencing.
The map reported here is the first physical map for Brassica "A" genome based on the High Information Content Fingerprinting (HICF) technique. This physical map will serve as a fundamental genomic resource for accelerating genome sequencing, assembly of BAC sequences, and comparative genomics between Brassica genomes. The current build of the B. rapa physical map is available at the B. rapa Genome Project website for the user community.
Recent advances, such as the availability of extensive genome survey sequence (GSS)
data and draft physical maps, are radically transforming the means by which we
can dissect Brassica genome structure and systematically relate it to the Arabidopsis
model. Hitherto, our view of the co-linearities between these closely related genomes
had been largely inferred from comparative RFLP data, necessitating substantial
interpolation and expert interpretation. Sequencing of the Brassica rapa genome
by the Multinational Brassica Genome Project will, however, enable an entirely
computational approach to this problem. Meanwhile we have been developing
databases and bioinformatics tools to support our work in Brassica comparative
genomics, including a recently completed draft physical map of B. rapa integrated
with anchor probes derived from the Arabidopsis genome sequence. We are also
exploring new ways to display the emerging Brassica–Arabidopsis sequence homology
data. We have mapped all publicly available Brassica sequences in silico to the
Arabidopsis TIGR v5 genome sequence and published this in the ATIDB database
that uses Generic Genome Browser (GBrowse). This in silico approach potentially
identifies all paralogous sequences and so we colour-code the significance of the
mappings and offer an integrated, real-time multiple alignment tool to partition them
into paralogous groups. The MySQL database driving GBrowse can also be directly
interrogated, using the powerful API offered by the Perl Bio::DB::GFF methods,
facilitating a wide range of data-mining possibilities.
Brassica rapa, which is closely related to
Arabidopsis thaliana, is an important crop and a
model plant for studying genome evolution via
polyploidization. We report the current understanding of the
genome structure of B. rapa and efforts for the
whole-genome sequencing of the species. The tribe
Brassiceae, which comprises ca. 240 species,
descended from a common hexaploid ancestor with a basic genome
similar to that of Arabidopsis. Chromosome
rearrangements, including fusions and/or fissions, resulted in
the present-day “diploid” Brassica
species with variation in chromosome number and phenotype.
Triplicated genomic segments of B. rapa are
collinear to those of A. thaliana with InDels.
The genome triplication has led to an approximately 1.7-fold
increase in the B. rapa gene number compared to
that of A. thaliana. Repetitive DNA of B.
rapa has also been extensively amplified and has
diverged from that of A. thaliana. For its
whole-genome sequencing, the Brassica rapa Genome
Sequencing Project (BrGSP) consortium has developed suitable
genomic resources and constructed genetic and physical maps.
Ten chromosomes of B. rapa are being allocated to
BrGSP consortium participants, and each chromosome will be
sequenced by a BAC-by-BAC approach. Genome sequencing of
B. rapa will offer a new perspective for plant
biology and evolution in the context of polyploidization.
The woodland strawberry, Fragaria vesca (2n = 2x = 14), is a versatile experimental plant system. This diminutive herbaceous perennial has a small genome (240 Mb), is amenable to genetic transformation and shares substantial sequence identity with the cultivated strawberry (Fragaria × ananassa) and other economically important rosaceous plants. Here we report the draft F. vesca genome, which was sequenced to ×39 coverage using second-generation technology, assembled de novo and then anchored to the genetic linkage map into seven pseudochromosomes. This diploid strawberry sequence lacks the large genome duplications seen in other rosids. Gene prediction modeling identified 34,809 genes, with most being supported by transcriptome mapping. Genes critical to valuable horticultural traits including flavor, nutritional value and flowering time were identified. Macrosyntenic relationships between Fragaria and Prunus predict a hypothetical ancestral Rosaceae genome that had nine chromosomes. New phylogenetic analysis of 154 protein-coding genes suggests that assignment of Populus to Malvidae, rather than Fabidae, is warranted.
The complex genome of rapeseed (Brassica napus) is not well understood despite the economic importance of the species. Good knowledge of sequence variation is needed for genetics approaches and breeding purposes. We used a diversity set of B. napus representing eight different germplasm types to sequence genome-wide distributed restriction-site associated DNA (RAD) fragments for polymorphism detection and genotyping.
More than 113,000 RAD clusters with more than 20,000 single nucleotide polymorphisms (SNPs) and 125 insertions/deletions were detected and characterized. About one third of the RAD clusters and polymorphisms mapped to the Brassica rapa reference sequence. An even distribution of RAD clusters and polymorphisms was observed across the B. rapa chromosomes, which suggests that there might be an equal distribution over the Brassica oleracea chromosomes, too. The representation of Gene Ontology (GO) terms for unigenes with RAD clusters and polymorphisms revealed no signature of selection with respect to the distribution of polymorphisms within genes belonging to a specific GO category.
Considering the decreasing costs for next-generation sequencing, the results of our study suggest that RAD sequencing is not only a simple and cost-effective method for high-density polymorphism detection but also an alternative to SNP genotyping from transcriptome sequencing or SNP arrays, even for species with complex genomes such as B. napus.
Brassica napus; Restriction-site associated DNA; Next-generation sequencing; Single nucleotide polymorphism; Genotyping by sequencing; Genetic diversity
A complete genome sequence provides unlimited information in the sequenced organism
as well as in related taxa. According to the guidance of the Multinational Brassica
Genome Project (MBGP), the Korea Brassica Genome Project (KBGP) is sequencing
chromosome 1 (cytogenetically oriented chromosome #1) of Brassica rapa. We
have selected 48 seed BACs on chromosome 1 using EST genetic markers and FISH
analyses. Among them, 30 BAC clones have been sequenced and 18 are on the way.
Comparative genome analyses of the EST sequences and sequenced BAC clones from
Brassica chromosome 1 revealed their homeologous partner regions on the Arabidopsis
genome and a syntenic comparative map between Brassica chromosome 1 and
Arabidopsis chromosomes. In silico chromosome walking and clone validation have
been successfully applied to extending sequence contigs based on the comparative
map and BAC end sequences. In addition, we have defined the (peri)centromeric
heterochromatin blocks with centromeric tandem repeats, rDNA and centromeric
retrotransposons. In-depth sequence analyses of five homeologous BAC clones and
an Arabidopsis chromosomal region reveal overall co-linearity, with 82% sequence
similarity. The data indicate that the Brassica genome has undergone triplication and
subsequent gene losses after the divergence of Arabidopsis and Brassica. Based on in-depth
comparative genome analyses, we propose a comparative genomics approach
for conquering the Brassica genome. In 2005 we intend to construct an integrated
physical map, including sequence information from 500 BAC clones and integration
of fingerprinting data and end sequence data of more than 100 000 BAC clones.
The sequences have been submitted to GenBank with accession numbers: 10 204
BAC ends of the KBrH library (CW978640–CW988843); KBrH138P04, AC155338;
KBrH117N09, AC155337; KBrH097M21, AC155348; KBrH093K03, AC155347;
KBrH081N08, AC155346; KBrH080L24, AC155345; KBrH077A05, AC155343;
KBrH020D15, AC155340; KBrH015H17, AC155339; KBrH001H24, AC155335;
KBrH080A08, AC155344; KBrH004D11, AC155341; KBrH117M18, AC146875;
Sequencing of the chloroplast (cp) genome using traditional sequencing methods has been difficult because of its size (>120 kb) and the complicated procedures required to prepare templates. To explore the feasibility of sequencing the cp genome using DNA extracted from whole cells and Solexa sequencing technology, we sequenced whole cellular DNA isolated from leaves of three Brassica rapa accessions with one lane per accession. In total, 246, 362, and 361 Mb of sequence data were generated for the three accessions Chiifu-401-42, Z16, and FT, respectively. Micro-reads were assembled by reference-guided assembly using the cpDNA sequences of B. rapa, Arabidopsis thaliana, and Nicotiana tabacum. We achieved coverage of more than 99.96% of the cp genome in the three tested accessions using the B. rapa sequence as the reference. When A. thaliana or N. tabacum sequences were used as references, 99.7–99.8 or 95.5–99.7% of the B. rapa cp genome was covered, respectively. These results demonstrated that sequencing of whole cellular DNA isolated from young leaves using the Illumina Genome Analyzer is an efficient method for high-throughput sequencing of the cp genome.
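The coverage percentages quoted above are fractions of reference positions spanned by at least one aligned read. A minimal sketch of that calculation follows, assuming read alignments are already available as 0-based half-open (start, end) intervals on the reference; the function name and the interval representation are assumptions for illustration and do not describe the assembler that was used.

```python
def covered_fraction(intervals, reference_length):
    """Fraction of reference positions covered by at least one interval.

    intervals: iterable of (start, end) pairs, 0-based and half-open.
    """
    covered = 0
    current_start = current_end = None
    for start, end in sorted(intervals):
        # Clip each alignment to the reference boundaries.
        start = max(start, 0)
        end = min(end, reference_length)
        if start >= end:
            continue
        if current_end is None or start > current_end:
            # A gap precedes this interval: close the previous merged block.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent: extend the current merged block.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return covered / reference_length


if __name__ == "__main__":
    reads = [(0, 50), (40, 90), (95, 100)]
    print(covered_fraction(reads, 100))
```

Merging sorted intervals avoids allocating a per-base array, which keeps the calculation cheap even when millions of short reads map to a reference of roughly 150 kb.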
chloroplast genome; sequencing; Solexa sequencing technology; whole cellular DNA; Brassica rapa
Brassica rapa is an important crop species that produces vegetables, oilseed, and fodder. Although many studies have reported quantitative trait loci (QTL) mapping, the genes governing most of its economically important traits are still unknown. In this study, we report QTL mapping for morphological and yield component traits in B. rapa and comparative map alignment between B. rapa, B. napus, B. juncea, and Arabidopsis thaliana to identify candidate genes and conserved QTL blocks among them. A total of 95 QTL were identified in different crucifer blocks of the B. rapa genome. Through synteny analysis with A. thaliana, B. rapa candidate genes and intronic and exonic single nucleotide polymorphisms in the parental lines were detected from whole genome resequencing data, a few of which were validated by mapping them to the QTL regions. Semi-quantitative reverse transcriptase PCR analysis showed differences in the expression levels of a few genes in parental lines. Comparative mapping identified five key major evolutionarily conserved crucifer blocks (R, J, F, E, and W) harbouring QTL for morphological and yield component traits between the A, B, and C subgenomes of B. rapa, B. juncea, and B. napus. Information on the identified candidate genes could be used for breeding B. rapa and other related Brassica species.
Brassica rapa; quantitative trait loci (QTL); morphological traits; single nucleotide polymorphism (SNP); conserved genome blocks
Completion of the sequencing of the Brassica rapa genome enabled us to undertake a genome-wide identification and functional study of the gene families related to the morphological diversity and agronomic traits of Brassica crops. In this study, we identified the auxin response factor (ARF) gene family, which is one of the key regulators of auxin-mediated plant growth and development, in the B. rapa genome. A total of 31 ARF genes were identified in the genome. Phylogenetic and evolutionary analyses suggest that ARF genes fell into four major classes and were amplified in the B. rapa genome as a result of a recent whole genome triplication after speciation from Arabidopsis thaliana. Despite its recent hexaploid ancestry, B. rapa includes a relatively small number of ARF genes compared with the 23 members in A. thaliana, presumably due to a paralog reduction related to repetitive sequence insertion into the promoter and non-coding transcribed regions of the genes. Comparative genomic and mRNA sequencing analyses demonstrated that 27 of the 31 BrARF genes were transcriptionally active, and their expression was affected by either auxin treatment or floral development stage, although 4 genes were inactive, suggesting that the generation and pseudogenization of ARF members are likely to be an ongoing process. This study will provide a fundamental basis for understanding the modification and evolution of the gene family after a polyploidy event, as well as for functional study of ARF genes in a polyploid crop species.
Electronic supplementary material
The online version of this article (doi:10.1007/s00438-012-0718-4) contains supplementary material, which is available to authorized users.
Brassica rapa; Auxin response factor; Genome organization; mRNA sequencing; Evolution
Anthocyanins are flavonoid pigments that are responsible for purple coloration in the stems and leaves of a variety of plant species. Anthocyaninless (anl) mutants of Brassica rapa fail to produce anthocyanin pigments. In rapid-cycling Brassica rapa, also known as Wisconsin Fast Plants, the anthocyaninless trait, also called non-purple stem, is widely used as a model recessive trait for teaching genetics. Although anthocyanin genes have been mapped in other plants such as Arabidopsis thaliana, the anl locus has not been mapped in any Brassica species.
We tested primer pairs known to amplify microsatellites in Brassicas and identified 37 that amplified a product in rapid-cycling Brassica rapa. We then developed three-generation pedigrees to assess linkage between the microsatellite markers and anl. Twenty-two of the markers that we tested were polymorphic in our crosses. Based on 177 F2 offspring, we identified three markers linked to anl with LOD scores ≥ 5.0, forming a linkage group spanning 46.9 cM. Because one of these markers has been assigned to a known B. rapa linkage group, we can now assign the anl locus to B. rapa linkage group R9.
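To make the LOD threshold concrete, the sketch below computes a two-point LOD score for the simplest case of phase-known, fully informative meioses, and converts a recombination fraction to map distance with the Kosambi function. Both simplifications are assumptions for illustration: F2 data with dominant or codominant markers require a more involved likelihood, and the abstract does not state which mapping function was used. The counts in the demo are hypothetical, not data from the study.

```python
import math


def lod_score(recombinants, meioses, theta=None):
    """Two-point LOD: log10 of L(theta) / L(0.5) for phase-known meioses.

    If theta is None, the maximum-likelihood estimate k / n is used,
    capped at 0.5 (free recombination).
    """
    k, n = recombinants, meioses
    if theta is None:
        theta = min(k / n, 0.5)
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    if k > 0 and theta == 0.0:
        return float("-inf")  # observed recombinants are impossible at theta = 0
    log_likelihood = 0.0
    if k > 0:
        log_likelihood += k * math.log10(theta)
    if n - k > 0:
        log_likelihood += (n - k) * math.log10(1.0 - theta)
    return log_likelihood - n * math.log10(0.5)


def kosambi_cm(theta):
    """Kosambi map distance in centimorgans for 0 <= theta < 0.5."""
    if not 0.0 <= theta < 0.5:
        raise ValueError("theta must lie in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * theta) / (1.0 - 2.0 * theta))


if __name__ == "__main__":
    # Hypothetical counts chosen only to exercise the functions.
    print(lod_score(recombinants=8, meioses=60))
    print(kosambi_cm(8 / 60))
```

A LOD of 5.0 means the observed data are 10^5 times more likely under linkage at the estimated recombination fraction than under independent assortment.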
This study is the first to identify the chromosomal location of an anthocyanin pigment gene among the Brassicas. It also connects a classical mutant frequently used in genetics education with molecular markers and a known chromosomal location.
Brassica oleracea encompasses a family of vegetables, including cabbage, that are among the most widely cultivated crops. In 2009, the B. oleracea Genome Sequencing Project was launched using next generation sequencing technology. None of the available maps was detailed enough to anchor the sequence scaffolds for the Genome Sequencing Project. This report describes the development of a large number of SSR and SNP markers from the whole genome shotgun sequence data of B. oleracea, and the construction of a high-density genetic linkage map using a doubled haploid mapping population.
The B. oleracea high-density genetic linkage map that was constructed includes 1,227 markers in nine linkage groups spanning a total of 1197.9 cM with an average of 0.98 cM between adjacent loci. There were 602 SSR markers and 625 SNP markers on the map. The chromosome with the highest number of markers (186) was C03, and the chromosome with the smallest number of markers (99) was C09.
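The reported average spacing can be checked from the figures in this paragraph. The abstract does not say which convention was used, so the short Python sketch below computes both common ones: total length divided by the number of markers, and total length divided by the number of intervals (markers minus linkage groups). The function names are illustrative.

```python
def mean_spacing_per_marker(total_cm, n_markers):
    """Total map length divided by the number of markers."""
    return total_cm / n_markers


def mean_spacing_per_interval(total_cm, n_markers, n_groups):
    """Total map length divided by the number of adjacent-marker intervals.

    Each linkage group with m markers contributes m - 1 intervals.
    """
    return total_cm / (n_markers - n_groups)


if __name__ == "__main__":
    total_cm, n_markers, n_groups = 1197.9, 1227, 9
    assert 602 + 625 == n_markers  # SSR + SNP markers sum to the map total
    print(round(mean_spacing_per_marker(total_cm, n_markers), 2))
    print(round(mean_spacing_per_interval(total_cm, n_markers, n_groups), 2))
```

Both conventions round to 0.98 cM here, so the reported figure is consistent with the marker count and map length under either definition.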
This first high-density map allowed the assembled scaffolds to be anchored to pseudochromosomes. The map also provides useful information for positional cloning, molecular breeding, and integration of information of genes and traits in B. oleracea. All the markers on the map will be transferable and could be used for the construction of other genetic maps.
Cabbage; Brassica; Genetic linkage map; SSR; SNP; Genome
The Brassica species, related to Arabidopsis thaliana, include an important group of crops and represent an excellent system for studying the evolutionary consequences of polyploidy. Previous studies have led to a proposed structure for an ancestral karyotype and models for the evolution of the B. rapa genome by triplication and segmental rearrangement, but these have not been validated at the sequence level.
We developed computational tools to analyse the public collection of B. rapa BAC end sequence, in order to identify candidates for representing collinearity discontinuities between the genomes of B. rapa and A. thaliana. For each putative discontinuity, one of the BACs was sequenced and analysed for collinearity with the genome of A. thaliana. Additional BAC clones were identified and sequenced as part of ongoing efforts to sequence four chromosomes of B. rapa. Strikingly few of the 19 inter-chromosomal rearrangements corresponded to the set of collinearity discontinuities anticipated on the basis of previous studies. Our analyses revealed numerous instances of newly detected collinearity blocks. For B. rapa linkage group A8, we were able to develop a model for the derivation of the chromosome from the ancestral karyotype. We were also able to identify a rearrangement event in the ancestor of B. rapa that was not shared with the ancestor of A. thaliana, and is represented in triplicate in the B. rapa genome. In addition to inter-chromosomal rearrangements, we identified and analysed 32 BACs containing the end points of segmental inversion events.
Our results show that previous studies of segmental collinearity between the A. thaliana, Brassica and ancestral karyotype genomes, although very useful, represent over-simplifications of their true relationships. The presence of numerous cryptic collinear genome segments and the frequent occurrence of segmental inversions mean that inference of the positions of genes in B. rapa based on the locations of orthologues in A. thaliana can be misleading. Our results will be of relevance to a wide range of plants that have polyploid genomes, many of which are being considered according to a paradigm of comprising conserved synteny blocks with respect to sequenced, related genomes.
Map-based cloning of quantitative trait loci (QTLs) in polyploid crop species remains a challenge due to the complexity of their genome structures. QTLs for seed weight in B. napus have been identified, but information on candidate genes for identified QTLs of this important trait is still rare.
In this study, a whole genome genetic linkage map for B. napus was constructed using simple sequence repeat (SSR) markers that covered a genetic distance of 2,126.4 cM with an average distance of 5.36 cM between markers. A procedure was developed to establish colinearity of SSR loci on B. napus with its two progenitor diploid species B. rapa and B. oleracea through extensive bioinformatics analysis. With the aid of B. rapa and B. oleracea genome sequences, the 421 homologous colinear loci deduced from the SSR loci of B. napus were shown to correspond to 398 homologous loci in Arabidopsis thaliana. Through comparative mapping of Arabidopsis and the three Brassica species, 227 homologous genes for seed size/weight were mapped on the B. napus genetic map, establishing the genetic bases for the important agronomic trait in this amphidiploid species. Furthermore, 12 candidate genes underlying 8 QTLs for seed weight were identified, and a gene-specific marker for BnAP2 was developed through molecular cloning using the seed weight/size gene distribution map in B. napus.
Our study showed that it is feasible to identify candidate genes of QTLs using an SSR-based B. napus genetic map through comparative mapping among Arabidopsis, B. napus, and its two progenitor species B. rapa and B. oleracea. Identification of candidate genes for seed weight in amphidiploid B. napus will accelerate the process of isolating the mapped QTLs for this important trait, and this approach may be useful for QTL identification of other traits of agronomic significance.
Brassicaceae; Rapeseed; Arabidopsis; Comparative mapping; QTL; Map-based cloning; Seed weight
For identification of genes responsible for varietal differences in flowering time and leaf morphological traits, we constructed a linkage map of Brassica rapa DNA markers including 170 EST-based markers, 12 SSR markers, and 59 BAC sequence-based markers, of which 151 are single nucleotide polymorphism (SNP) markers. By BLASTN, 223 markers were shown to have homologous regions in Arabidopsis thaliana, and these homologous loci covered nearly the whole genome of A. thaliana. Synteny analysis between B. rapa and A. thaliana revealed 33 large syntenic regions. Three quantitative trait loci (QTLs) for flowering time were detected. BrFLC1 and BrFLC2 were linked to the QTLs for bolting time, budding time, and flowering time. Three SNPs in the promoter, which may be the cause of low expression of BrFLC2 in the early-flowering parental line, were identified. For leaf lobe depth and leaf hairiness, one major QTL corresponding to a syntenic region containing GIBBERELLIN 20 OXIDASE 3 and one major QTL containing BrGL1, respectively, were detected. Analysis of nucleotide sequences and expression of these genes suggested possible involvement of these genes in leaf morphological traits.
DNA markers; synteny; bolting time; leaf lobe; leaf hairiness
Genome evolution is a continuous process and genomic rearrangement occurs both within and between species. With the sequencing of the Arabidopsis thaliana genome, comparative genetics and genomics offer new insights into plant biology. The genus Brassica offers excellent opportunities with which to compare genomic synteny so as to reveal genome evolution. During a previous genetic analysis of clubroot resistance in Brassica rapa, we identified a genetic region that is highly collinear with Arabidopsis chromosome 4. This region corresponds to a disease resistance gene cluster in the A. thaliana genome. Relying on synteny with Arabidopsis, we fine-mapped the region and found that the location and order of the markers showed good correspondence with those in Arabidopsis. Microsynteny on a physical map indicated an almost parallel correspondence, with a few rearrangements such as inversions and insertions. The results show that this genomic region of Brassica is conserved extensively with that of Arabidopsis and has potential as a disease resistance gene cluster, although the genera diverged 20 million years ago.
microsynteny; genome evolution; genome organization; genomic collinearity; BAC library
The species Brassica rapa includes important vegetable and oil crops. It also serves as an excellent model system to study polyploidy-related genome evolution because of its paleohexaploid ancestry and its close evolutionary relationships with Arabidopsis thaliana and other Brassica species with larger genomes. Therefore, its genome sequence will be used to accelerate both basic research on genome evolution and applied research across the cultivated Brassica species.
We have determined and analyzed the sequence of B. rapa chromosome A3. We obtained 31.9 Mb of sequences, organized into nine contigs, which incorporated 348 overlapping BAC clones. Annotation revealed 7,058 protein-coding genes, with an average gene density of 4.6 kb per gene. Analysis of chromosome collinearity with the A. thaliana genome identified conserved synteny blocks encompassing the whole of the B. rapa chromosome A3 and sections of four A. thaliana chromosomes. The frequency of tandem duplication of genes differed between the conserved genome segments in B. rapa and A. thaliana, indicating differential rates of occurrence/retention of such duplicate copies of genes. Analysis of 'ancestral karyotype' genome building blocks enabled the development of a hypothetical model for the derivation of the B. rapa chromosome A3.
We report the near-complete chromosome sequence from a dicotyledonous crop species. This provides an example of the complexity of genome evolution following polyploidy. The high degree of contiguity afforded by the clone-by-clone approach provides a benchmark for the performance of whole genome shotgun approaches presently being applied in B. rapa and other species with complex genomes.
The Brassica species include an important group of crops and provide opportunities for studying the evolutionary consequences of polyploidy. They are related to Arabidopsis thaliana, for which the first complete plant genome sequence was obtained and their genomes show extensive, although imperfect, conserved synteny with that of A. thaliana. A large number of EST sequences, derived from a range of different Brassica species, are available in the public database, but no public microarray resource has so far been developed for these species.
We assembled unigenes using ~800,000 EST sequences, mainly from three species: B. napus, B. rapa and B. oleracea. The assembly was conducted with the aim of co-assembling ESTs of orthologous genes (including homoeologous pairs of genes in B. napus from each of the A and C genomes), but resolving assemblies of paralogous, or paleo-homoeologous, genes (i.e. the genes related by the ancestral genome triplication observed in diploid Brassica species). 90,864 unique sequence assemblies were developed. These were incorporated into the BAC sequence annotation for the Brassica rapa Genome Sequencing Project, enabling the identification of cognate genomic sequences for a proportion of them. A 60-mer oligo microarray comprising 94,558 probes was developed using the unigene sequences. Gene expression was analysed in reciprocal resynthesised B. napus lines and the B. oleracea and B. rapa lines used to produce them. The analysis showed that significant expression could consistently be detected in leaf tissue for 35,386 unigenes. Expression was detected across all four genotypes for 27,355 unigenes, genome-specific expression patterns were observed for 7,851 unigenes and 180 unigenes displayed other classes of expression pattern. Principal component analysis (PCA) clearly resolved the individual microarray datasets for B. rapa, B. oleracea and resynthesised B. napus. Quantitative differences in expression were observed between the resynthesised B. napus lines for 98 unigenes, most of which could be classified into non-additive expression patterns, including 17 that showed cytoplasm-specific patterns. We further characterized the unigenes for which A genome-specific expression was observed and cognate genomic sequences could be identified. Ten of these unigenes were found to be Brassica-specific sequences, including two that originate from complex loci comprising gene clusters.
We succeeded in developing a Brassica community microarray resource. Although expression can be measured for the majority of unigenes across species, there were numerous probes that reported in a genome-specific manner. We anticipate that some proportion of these will represent species-specific transcripts and the remainder will be the consequence of variation of sequences within the regions represented by the array probes. Our studies demonstrated that the datasets obtained from the arrays can be used for typical analyses, including PCA and the analysis of differential expression. We have also demonstrated that Brassica-specific transcripts identified in silico in the sequence assembly of public EST database accessions are indeed reported by the array. These would not be detectable using arrays designed using A. thaliana sequences.
One theoretical explanation for the relatively poor performance of Brassica rapa (weed) × Brassica napus (crop) transgenic hybrids suggests that hybridization imparts a negative genetic load. Consequently, in hybrids genetic load could overshadow any benefits of fitness enhancing transgenes and become the limiting factor in transgenic hybrid persistence. Two types of genetic load were analyzed in this study: random/linkage-derived genetic load, and directly incorporated genetic load using a transgenic mitigation (TM) strategy. In order to measure the effects of random genetic load, hybrid productivity (seed yield and biomass) was correlated with crop- and weed-specific AFLP genomic markers. This portion of the study was designed to answer whether or not weed × transgenic crop hybrids possessing more crop genes were less competitive than hybrids containing fewer crop genes. The effects of directly incorporated genetic load (TM) were analyzed through transgene persistence data. TM strategies are proposed to decrease transgene persistence if gene flow and subsequent transgene introgression to a wild host were to occur.
In the absence of interspecific competition, transgenic weed × crop hybrids benefited from having more crop-specific alleles. There was a positive correlation between performance and number of B. napus crop-specific AFLP markers [seed yield vs. marker number (r = 0.54, P = 0.0003) and vegetative dry biomass vs. marker number (r = 0.44, P = 0.005)]. However, under interspecific competition with wheat, i.e. more weed-like conditions representing a situation where hybrid plants emerge as volunteer weeds in subsequent cropping systems, there was a positive correlation between the number of B. rapa weed-specific AFLP markers and seed yield (r = 0.70, P = 0.0001), although no such correlation was detected for vegetative biomass. When genetic load was directly incorporated into the hybrid genome, by inserting a fitness-mitigating dwarfing gene that is beneficial for crops but deleterious for weeds (a transgene mitigation measure), there was a dramatic decrease in the number of transgenic hybrid progeny persisting in the population.
The effects of genetic load from crop alleles and, in some situations, weed alleles might be beneficial under certain environmental conditions. However, when genetic load was directly incorporated into transgenic events, e.g., using a TM construct, the number of transgenic hybrids and their persistence in weedy genomic backgrounds were significantly decreased.
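The marker-number correlations reported above are correlation coefficients (r) between a per-plant marker count and a productivity measure. As an illustration only, the following minimal sketch computes a Pearson r in plain Python. It assumes the reported r values are Pearson coefficients, which the abstract does not state, and the marker counts and seed yields below are invented placeholders, not data from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical example: crop-specific AFLP marker counts per hybrid plant
# and the corresponding seed yield in grams (placeholder values).
marker_counts = [3, 5, 8, 10, 12, 15]
seed_yield_g = [1.1, 1.9, 2.4, 2.2, 3.5, 3.9]

r = pearson_r(marker_counts, seed_yield_g)
print(f"r = {r:.2f}")
```

A significance value such as the P values quoted above would additionally require a test against the null hypothesis of no correlation, which this sketch does not implement.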
Polyploidization, both ancient and recent, is frequent among plants. A “two-step theory” was proposed to explain the meso-triplication of the Brassica “A” genome: Brassica rapa. By accurately partitioning this genome, we observed that genes in the less fractioned subgenome (LF) were dominantly expressed over the genes in the more fractioned subgenomes (MFs: MF1 and MF2), while the genes in MF1 were slightly dominantly expressed over the genes in MF2. The results indicated that the dominantly expressed genes tended to be resistant to gene fractionation. By re-sequencing two B. rapa accessions, a vegetable turnip (VT117) and a Rapid Cycling line (L144), we found that genes in LF had fewer non-synonymous or frameshift mutations than genes in MFs; however, mutation rates were not significantly different between MF1 and MF2. The differences in gene expression patterns and on-going gene death among the three subgenomes suggest that “two-step” genome triplication and differential subgenome methylation played important roles in the genome evolution of B. rapa.
Following successful completion of the Brassica rapa sequencing project, the next step is to investigate the functions of individual genes/proteins. For Arabidopsis thaliana, large amounts of protein–protein interaction (PPI) data are available from the major PPI databases (DBs). Brassica crop species are known to be closely related to A. thaliana. This provides an opportunity to infer the B. rapa interactome using PPI data available from A. thaliana. In this paper, we present an inferred B. rapa interactome that is based on A. thaliana PPI data from two resources: (i) A. thaliana PPI data from three major DBs, BioGRID, IntAct, and TAIR; and (ii) ortholog-based A. thaliana PPI predictions. Linking between B. rapa and A. thaliana was accomplished in three complementary ways: (i) ortholog predictions, (ii) identification of gene duplication based on synteny and collinearity, and (iii) BLAST sequence similarity search. A complementary approach was also applied, which used known/predicted domain–domain interaction data. Specifically, since the two species are closely related, we used PPI data from A. thaliana to predict interacting domains that might be conserved between the two species. The predicted interactome was investigated for the component that contains known A. thaliana meiotic proteins to demonstrate its usability.
Brassica rapa; Arabidopsis thaliana; interactome; protein–protein interaction; domain–domain interaction; meiosis
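The ortholog-based transfer of interactions described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name `infer_interactome`, the data layout, and all gene identifiers are hypothetical placeholders, and the sketch covers only the ortholog-mapping route, not the synteny, BLAST, or domain–domain routes.

```python
from itertools import product


def infer_interactome(ath_ppis, orthologs):
    """Transfer A. thaliana interactions to B. rapa through an ortholog map.

    ath_ppis:  iterable of (gene_a, gene_b) A. thaliana interaction pairs.
    orthologs: dict mapping an A. thaliana gene to a list of B. rapa genes
               (hypothetical layout; one gene may map to several copies).
    Returns a set of unordered B. rapa pairs, stored as sorted tuples.
    """
    inferred = set()
    for gene_a, gene_b in ath_ppis:
        # Every combination of B. rapa copies of the two partners is a candidate.
        for br_a, br_b in product(orthologs.get(gene_a, ()), orthologs.get(gene_b, ())):
            inferred.add(tuple(sorted((br_a, br_b))))
    return inferred


# Placeholder identifiers, not real locus names.
ath_ppis = [("AtA", "AtB"), ("AtB", "AtC")]
orthologs = {"AtA": ["BrA1", "BrA2"], "AtB": ["BrB1"]}  # AtC has no mapped ortholog

predicted = infer_interactome(ath_ppis, orthologs)
print(sorted(predicted))
```

Because a single A. thaliana gene can map to several B. rapa copies, one known interaction may yield several candidate pairs, while interactions involving a gene with no mapped ortholog yield none.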