Genome evolution is a continuous process and genomic rearrangement occurs both within and between species. With the sequencing of the Arabidopsis thaliana genome, comparative genetics and genomics offer new insights into plant biology. The genus Brassica offers excellent opportunities with which to compare genomic synteny so as to reveal genome evolution. During a previous genetic analysis of clubroot resistance in Brassica rapa, we identified a genetic region that is highly collinear with Arabidopsis chromosome 4. This region corresponds to a disease resistance gene cluster in the A. thaliana genome. Relying on synteny with Arabidopsis, we fine-mapped the region and found that the location and order of the markers showed good correspondence with those in Arabidopsis. Microsynteny on a physical map indicated an almost parallel correspondence, with a few rearrangements such as inversions and insertions. The results show that this genomic region of Brassica is conserved extensively with that of Arabidopsis and has potential as a disease resistance gene cluster, although the genera diverged 20 million years ago.
Arabidopsis thaliana is an excellent model species for dicots, with a small and relatively simple genome (125 Mb) and a diverse range of genetic resources. The completion of its genome sequence has yielded new insights into plant genetics and genomics (Arabidopsis Genome Initiative 2000). Because patterns of chromosomal collinearity have been identified between Arabidopsis and other flowering plants, the Arabidopsis genome sequences provide an excellent resource for identifying the genes controlling target phenotypes, i.e. physically and functionally conserved genes, which are known as orthologues and paralogues (Axelsson et al. 2001, Grant et al. 1998, Rossberg et al. 2001).
The genus Brassica is closely related to Arabidopsis. The genetic relations within Brassica have been well studied and are explained by “U’s triangle” (U 1935). The genomes of three diploid species, B. rapa, B. nigra and B. oleracea, are designated as A, B and C, respectively and those of three amphidiploids, B. juncea, B. napus and B. carinata, as AB, AC and BC, respectively (U 1935). Because natural hybridization both within and between species has created wide genetic and morphological variation, the remarkable diversity within Brassica is an excellent source for understanding how genetic and morphological variation has developed during the evolution of plant genomes. Several studies have established chromosomal collinearity between Brassica and Arabidopsis (reviewed in Schmidt 2002, Schmidt et al. 2001) and the degree of synteny between them provides a good opportunity to study how genetic and morphological variation has developed during the evolution of the genome, including the endurance of certain genetic structures. Such investigation would lead to a better understanding of plant genetics and genomics, including molecular biological and physiological aspects which have emerged during evolution. More generally, it would provide insights into the organization and function of larger and more complex genomes in plants.
It is widely accepted that the diploid Brassica species evolved from a common hexaploid ancestor and that large-scale genome duplication (polyploidization) occurred during their evolution (Cavell et al. 1998, Parkin et al. 2002, Schmidt et al. 2001). Studies based on genetic mapping revealed conserved genomic segments with similar gene content and order between species. These observations were confirmed at the micro-level in selected chromosomal regions (O’Neill and Bancroft 2000, Quiros et al. 2001, Sadowski et al. 1996) and at the macro-level in the whole genome (Mun et al. 2009, Town et al. 2006). At the same time, complex genomic relationships have also been identified within Brassica and between Brassica and Arabidopsis. Microsynteny analysis identified potential inversion, insertion and gene loss in Brassica homoeologous genomic segments relative to Arabidopsis homologous segments (O’Neill and Bancroft 2000, Rana et al. 2004). This supports the idea that fine-scale chromosome rearrangements in Brassica occurred both before and after divergence of species. Thus, although comparative analysis should play a prominent role in plant genetics and genomics, chromosomal evolution between species complicates the picture.
This study focuses on the genome of B. rapa, a diploid species (2n = 20) with the smallest genome (550 Mb) in Brassica. During research on clubroot resistance in B. rapa, we found three independent QTLs for resistance with distinct functions (Suwabe et al. 2006). Two of them, named Crr1 and Crr2, have synteny with a small region of Arabidopsis chromosome 4. This region houses a disease resistance gene cluster (major recognition complex: MRC) in the A. thaliana genome. This similarity suggests that the genes for clubroot resistance originated from an MRC in a common ancestral genome and were subsequently distributed to the different regions they now inhabit in the process of evolution. We aimed here at characterizing the genomic microstructure for a potential disease resistance gene cluster in the Brassica genome through fine-scale genetic mapping and physical mapping, and we discuss the physical and genomic relationships of this region with respect to Arabidopsis MRCs. We also discuss characteristic features of this region by comparison with the draft genome sequence of B. rapa, which was recently reported (BRGSPC 2011).
Chinese cabbage doubled-haploid line G004 and parental line Nou 7 (A9709) were used as the parents for the mapping population and G004 was also used for BAC library construction. G004 is derived from a clubroot-resistant cultivar of European fodder turnip, ‘Siloga’ (Kuginuki et al. 1997). For fine mapping, an F2 population of 1920 plants was obtained from a cross of A9709 × G004 and F3 seeds of each F2 plant were obtained by bud self-pollination. All F2 plants were grown in a greenhouse.
Our previous study revealed that Crr1, a major QTL for clubroot resistance in B. rapa located within a 1.6-cM region between SSR markers BRMS-173 and BRMS-088 in linkage group (LG) 7 (A8), had homology with the genomic region of MRC-H (Holub 1997, Speulman et al. 1998) on A. thaliana chromosome 4. This region houses three SNP/indel markers, BSA1, 2 and 7, which were designed by synteny with Arabidopsis (Suwabe et al. 2006). In addition, two CAPS markers, AT27 and BZ2-DraI, were designed by the same strategy. All markers cosegregated completely with BRMS-173/BRMS-088 in the F2 population of 94 plants analyzed before. For fine mapping of this region, we used these markers for genetic analysis in the F2 population of 1920 plants. F3 individuals of the recombinant F2 plants were used for the clubroot resistance test to identify recombination between BRMS-173 and BRMS-088. The test for clubroot resistance and data evaluation are described in a previous report (Suwabe et al. 2003).
Young leaves were collected from G004 and immediately frozen in liquid nitrogen before being stored at −80°C. The nuclei were extracted as described by Suzuki et al. (2001). The extracted nuclei were embedded in agarose plugs and lysed twice in lysis buffer (100 mM EDTA, 1% Sarkosyl, 0.1 mg/mL proteinase K) for 24 h at 50°C. The plugs were rinsed with TE50 (10 mM Tris·HCl, pH 8.0, 50 mM EDTA) plus 2 mM phenylmethylsulfonylfluoride (PMSF) for 2 h at 50°C and re-rinsed 5 times with TE50 for 2 h at 4°C, and then stored at 4°C until use.
Plug-embedded high-molecular-weight genomic DNA was partially digested with HindIII and size-fractionated by pulsed-field gel electrophoresis (CHEF DR II, Bio-Rad, Hercules, CA, USA), first at 6 V/cm with 90 s pulse time for 4 h and then at 6 V/cm with 6 s pulse time for 12 h. The gel slices containing DNA fragments of ~150 kb in size were excised and incubated with 10 U/g β-agarase I (New England Biolabs, Ipswich, MA, USA) for 2 h at 40°C. After incubation, the DNA solution was concentrated at 45°C and dialyzed through a 0.025-μm nitrocellulose membrane (Millipore, Billerica, MA, USA) to a suitable concentration. The CopyControl pCC1BAC vector (Epicentre, Madison, WI, USA) was used as the cloning vector, with ligation at a molar ratio of 5 vectors to 1 insert. The ligation reaction followed the manufacturer’s instructions (CopyControl BAC cloning kit). Aliquots of ligation solution were electroporated into E. coli TransforMax EPI300 cells (Epicentre) in a Gene Pulser II electroporation system (Bio-Rad) at 1.25 kV, 25 μF and 100 Ω. The electroporated cells were spread onto an LB plate containing 12.5 μg/mL chloramphenicol, X-Gal and IPTG and incubated for 24 h at 37°C. Each recombinant (white colony) was picked and inoculated into 200 μL LB medium with 12.5 μg/mL chloramphenicol and freezing buffer (Zimmer and Verrinder Gibbins 1997) in a 96-well microtiter plate. After incubation at 37°C overnight, the plates were stored at −80°C.
BAC libraries were screened by a two-step PCR-based strategy with each anchor and BAC-end marker. First, 96 BAC clones were pooled together as a “bin” and plasmid DNAs were extracted from them by the standard alkali method. The first screening used 2-D scoring of 400 bins and the bins which showed a marker-specific amplification by PCR were selected. The individual positive clone in a bin was determined by the second screening, in which the 96 individual clones were screened. BAC-end sequencing was conducted with T7 and Reverse universal primers in a model 3100 genetic analyzer (Applied Biosystems, Foster City, CA, USA) and the BAC-end specific primers were designed from the region to amplify products of ~200–500 bp. After the specific PCR amplification by the designed BAC-end primers was confirmed with the genomic DNA of G004, BAC contigs for the target region were screened and assembled by chromosome walking on the basis of overlaps between BAC clones by specific PCR amplification.
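The two-step screening logic can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the laboratory protocol: it counts one PCR per bin in the first step (the 2-D scoring of bins mentioned above is not modeled), and the clone indices and the function names `bin_of` and `two_step_screen` are hypothetical.

```python
# Sketch of two-step pooled PCR screening; positives are hypothetical clone indices.
LIBRARY_SIZE = 38_400   # clones in the library (from the text)
BIN_SIZE = 96           # clones pooled per bin (from the text)


def bin_of(clone_index: int) -> int:
    """Return the bin that holds a clone, numbering clones and bins from 0."""
    return clone_index // BIN_SIZE


def two_step_screen(positive_clones: set[int]) -> tuple[set[int], int]:
    """Simulate screening for one marker.

    Returns the clones recovered and the number of PCR reactions used,
    assuming one reaction per bin and one per clone in each positive bin.
    """
    n_bins = LIBRARY_SIZE // BIN_SIZE  # 400 bins
    # Step 1: a bin amplifies if it contains at least one positive clone.
    positive_bins = {bin_of(c) for c in positive_clones}
    reactions = n_bins
    # Step 2: test each of the 96 clones in every positive bin.
    recovered = set()
    for b in sorted(positive_bins):
        start = b * BIN_SIZE
        for clone in range(start, start + BIN_SIZE):
            reactions += 1
            if clone in positive_clones:
                recovered.add(clone)
    return recovered, reactions


if __name__ == "__main__":
    found, n_pcr = two_step_screen({5, 100, 38_399})
    print(sorted(found), n_pcr)
```

Under these assumptions, three positive clones in three different bins are recovered with 400 + 3 × 96 reactions instead of the 38 400 needed to test every clone individually.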
The nucleotide sequences of each BAC end of B. rapa were aligned with Arabidopsis genome sequences by BLASTN search in the Arabidopsis database (TAIR: http://www.arabidopsis.org). Genes with high homology to Arabidopsis genes (threshold value of E < 10^-10) were regarded as homologous. Genes with lower homology (10^-10 < E < 10^-5) were confirmed manually. In addition, to assess the conservation of disease resistance genes (R-genes) in the region, we selected 11 candidate genes from the corresponding region of Arabidopsis chromosome 4, designed specific primers, and compared nucleotide sequences between B. rapa and A. thaliana.
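The E-value thresholds above amount to a simple three-way rule, sketched here in Python. The function name `classify_hit` and the category labels are hypothetical. The text does not say how a hit with E exactly equal to a threshold, or with E above the upper threshold, was handled; the sketch assumes the former goes to the more cautious category and the latter is not treated as homologous.

```python
def classify_hit(e_value: float) -> str:
    """Classify a BLASTN hit by its E-value using the thresholds in the text."""
    if e_value < 0:
        raise ValueError("E-values cannot be negative")
    if e_value < 1e-10:
        return "homologous"          # accepted without further checks
    if e_value < 1e-5:
        return "manual review"       # lower homology, confirmed by hand
    return "not homologous"          # assumption: hits above 1e-5 are dropped


if __name__ == "__main__":
    for e in (1e-30, 1e-7, 0.01):
        print(e, classify_hit(e))
```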
For fine mapping of the 1.6-cM genetic region between BRMS-088 and BRMS-173, we genotyped three SNP/indel and two CAPS markers in an F2 population of 1920 plants. This genetic region had been identified as syntenous with a central region of the long arm of Arabidopsis chromosome 4 (Suwabe et al. 2006). The region in Arabidopsis has been located within a disease resistance gene cluster (MRC-H), in which many R-genes with typical motifs such as nucleotide-binding sites (NBSs) and leucine-rich repeats (LRRs) lie close to each other (Holub 1997, Hulbert et al. 2001, Speulman et al. 1998). All markers were mapped between BRMS-088 and BRMS-173 as expected (Fig. 1A). Their order corresponded well with that in the Arabidopsis genome, indicating that this region is homoeologous between Brassica and Arabidopsis. The genetic distances between markers from BRMS-088 to BRMS-173 totaled 3.9 cM (Fig. 1A). The total value was 2.4 times that of a previous estimate in the F2 population of 94 plants (Suwabe et al. 2006). In that previous population, this region showed a genetic deviation from the Mendelian ratio (1 : 2 : 1) that might not reflect the genetic distance accurately.
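A deviation from the 1 : 2 : 1 ratio of this kind is usually checked with a chi-square goodness-of-fit test, which the following Python sketch illustrates. The genotype counts are invented for illustration and are not data from this study; the function name `chi_square_1_2_1` is hypothetical, and 5.991 is the standard critical value for 2 degrees of freedom at the 5% level.

```python
def chi_square_1_2_1(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Chi-square statistic for F2 genotype counts against a 1:2:1 expectation."""
    total = n_aa + n_ab + n_bb
    expected = (total / 4, total / 2, total / 4)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


CRITICAL_5PCT_DF2 = 5.991  # chi-square critical value, df = 2, alpha = 0.05

if __name__ == "__main__":
    # Hypothetical counts for a population of 94 plants.
    stat = chi_square_1_2_1(15, 47, 32)
    print(round(stat, 3), "distorted" if stat > CRITICAL_5PCT_DF2 else "fits 1:2:1")
    # Ratio of the two map-length estimates reported in the text.
    print(round(3.9 / 1.6, 1))
```

With the hypothetical counts above the statistic exceeds the critical value, which is the situation in which map distances estimated from the marker genotypes become unreliable.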
We found 147 recombinant F2 plants in the genomic region between BRMS-173 and BRMS-088. F3 seeds (populations) were obtained from 124 of these F2 plants; the other 23 did not produce enough seeds for the resistance test. The infection test was carried out on 64 F3 populations. With the exception of one population, all plants in the 24 F3 populations carrying the resistance genotype at BSA7 showed the resistance phenotype (Fig. 1B). In contrast, most plants in the 40 F3 populations carrying the susceptible genotype at BSA7 showed the susceptible phenotype. This indicates that Crr1 is located around BSA7. In addition, plants homozygous for the resistance allele at BSA1 and homozygous for the susceptible allele at BSA2 segregated into both resistant and susceptible phenotypes (the fifth group from the top in the graphical genotype in Fig. 1B), so another minor effect was predicted around BSA2. Thus Crr1 for clubroot resistance is likely to consist of two genetic loci located in the region around BSA2–BSA7.
For physical mapping, we constructed a large-insert BAC library using B. rapa line G004. The library consisted of 38 400 clones. Sixty of these were selected randomly and cut for measurement of insert size by NotI digestion (Fig. 2). Insert sizes ranged from 7 to 138 kb, with an average of 67.4 kb. Despite our using DNA fractions of ~150 kb for library construction, some clones had smaller inserts and average size was lower than in earlier reports (O’Neill and Bancroft 2000, Rana et al. 2004), likely owing to entrapment of small DNA fragments in the agarose gel, which two-step size fractionation would avoid (Nakamura et al. 1997). Overall, our library would provide ca. 4.7-fold redundant representation of the B. rapa genome (Arumuganathan and Earl 1991).
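The redundancy figure follows directly from the clone number, the average insert size and the genome size of 550 Mb given earlier, as this short Python sketch shows. The second function applies the standard Clarke–Carbon estimate of the chance that a given sequence is present in the library; that estimate is an addition for illustration and is not reported in the text, and both function names are hypothetical.

```python
def library_coverage(n_clones: int, mean_insert_kb: float, genome_mb: float) -> float:
    """Genome equivalents represented by the library."""
    return n_clones * mean_insert_kb / (genome_mb * 1000)


def probability_of_recovery(n_clones: int, mean_insert_kb: float, genome_mb: float) -> float:
    """Clarke-Carbon estimate: P = 1 - (1 - insert/genome) ** n_clones."""
    fraction = mean_insert_kb / (genome_mb * 1000)
    return 1 - (1 - fraction) ** n_clones


if __name__ == "__main__":
    # Values from the text: 38 400 clones, 67.4 kb average insert, 550 Mb genome.
    print(round(library_coverage(38_400, 67.4, 550), 1))
    print(round(probability_of_recovery(38_400, 67.4, 550), 3))
```

The estimate assumes random, unbiased cloning of the genome, which partial HindIII digestion only approximates.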
Because Crr1 was delimited to the region around BSA2–BSA7 by the recombination test (Fig. 1B), we assembled a BAC contig of the 0.6-cM region between BSA2 and BSA7, using BSA2, AT27 and BSA7 as anchors (Fig. 1C). The anchors and BAC-end markers identified 2 to 11 clones, with an average of 5.3 clones per marker (Fig. 1C). However, in the PCR analysis used to confirm overlaps between clones, some clones showed no connection with the other clones identified by the same marker. These ambiguous clones, presumably paralogous or chimeric, were excluded from the analysis. Consequently, by screening with three anchor markers and six BAC-end-specific markers, we identified a total of 28 clones in the region covering 0.6 cM. The genomic region between BSA2 and AT27 was covered by only one clone (289F12, 112.4 kb), whereas the region between AT27 and BSA7 was covered by six overlapping clones, with an estimated physical distance of ~599.9 kb.
To evaluate the microsynteny of B. rapa and A. thaliana, we used Brassica BAC-end sequences to seek homologous genes in the Arabidopsis genome. Several homologous genes on chromosome 4 were found in the resulting Brassica BAC contig (Fig. 3 and Table 1). The order of genes was well conserved between species, except for one inversion. The inverted region housed genes associated with other chromosomes of Arabidopsis, most of which were multi-copy genes or members of gene families such as those encoding F-box proteins, protein kinases, homeobox elements and retrotransposons. Because there was no continuous synteny with other chromosome regions, the insertions probably resulted from occasional events after diversification between Brassica and Arabidopsis.
To assess the conservation of R-genes in the region, the 11 selected candidate genes were compared between B. rapa and A. thaliana. The genes have characteristic motifs for R-genes, such as NBSs and LRRs, but the functions of some have not yet been characterized, so these are predicted (putative) genes. Four of these genes were amplified from BAC clones which cover the BSA2–BSA7 contig: At4g19920 from BAC clone 289F12, At4g20270 and At4g20380 from 289F12 and 188D5 and At4g21450 from 321C5 and 150F3. The position of each gene corresponded with other homologous genes identified by comparison of the BAC-end sequence with the Arabidopsis genome (Fig. 1C, Fig. 3). These results indicated that this syntenous region originated from the region in the common ancestor genome which corresponded to part of Arabidopsis chromosome 4, although some rearrangements have occurred since diversification between Brassica and Arabidopsis. The Brassica genomic regions of BSA2 to AT27 (~112.4 kb) and AT27 to BSA7 (~599.9 kb) corresponded to Arabidopsis regions of 115 kb and 695.6 kb, respectively, indicating that these regions remain almost parallel between species.
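The size comparison behind "almost parallel" can be made explicit as a ratio of Brassica to Arabidopsis segment length. The Python sketch below uses only the four sizes quoted above; the dictionary layout and the function name `size_ratios` are hypothetical.

```python
# Segment sizes in kb, taken from the text: (B. rapa, A. thaliana).
SEGMENTS_KB = {
    "BSA2-AT27": (112.4, 115.0),
    "AT27-BSA7": (599.9, 695.6),
}


def size_ratios(segments: dict[str, tuple[float, float]]) -> dict[str, float]:
    """Return the B. rapa / A. thaliana size ratio for each segment and overall."""
    ratios = {name: brassica / arabidopsis
              for name, (brassica, arabidopsis) in segments.items()}
    total_brassica = sum(b for b, _ in segments.values())
    total_arabidopsis = sum(a for _, a in segments.values())
    ratios["total"] = total_brassica / total_arabidopsis
    return ratios


if __name__ == "__main__":
    for name, ratio in size_ratios(SEGMENTS_KB).items():
        print(f"{name}: {ratio:.2f}")
```

A ratio close to 1 means the segment has neither expanded nor contracted appreciably relative to its Arabidopsis counterpart.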
O’Neill and Bancroft (2000) reported that the 222-kb Arabidopsis genomic segment containing genes At4g17260–At4g17800, just upstream of our site on chromosome 4, was partially duplicated on chromosome 5 and triplicated in the B. oleracea genome, although extensive divergence of gene contents was evident. Those homologous segments of B. oleracea were 1.01 to 4.44 times the size of those of Arabidopsis, suggesting a different diversification of each after the divergence of Brassica and Arabidopsis. The genome triplication of diploid Brassica species is widely accepted, as seen in other diploid Brassica species (O’Neill and Bancroft 2000) and amphidiploid species (Parkin et al. 2005) and can be seen in several genomic regions (Park et al. 2005, Rana et al. 2004). Recently, the Multinational Brassica Genome Project (MBGP) consortium reported the draft genome sequence of B. rapa, which is equivalent to half of the whole genome, covering over 98% (283.8 Mb) of the gene space in the B. rapa genome (BRGSPC 2011). The MBGP confirmed the almost complete triplication of the B. rapa genome relative to A. thaliana; 91.13% (259.6 Mb) of the B. rapa genome assembly contained blocks that were collinearly syntenous with the A. thaliana genome. At the same time, however, there were significant disparities in gene loss across the triplicated blocks: in the least fractionated blocks, 70% of the genes were retained within the syntenous regions; in the moderately fractionated blocks, 46% were retained; while in the most fractionated blocks, 36% were retained. Thus the complexity of the chromosomal rearrangements that occurred both before and after the differentiation of Brassica species is clear and evolution has expanded the genome (Quiros et al. 2001). However, the region of focus on A8 (LG7) may be different. The physical sizes corresponding to BSA2–AT27 and AT27–BSA7 (Fig. 1) were estimated to be ~112.4 and ~599.9 kb, respectively, in B. rapa and 115 and 695.6 kb in Arabidopsis. 
No blocks showing continuous synteny with other chromosomes were found in the region in B. rapa (Fig. 3). These results indicate that this region is almost parallel between species, although B. rapa has a far larger genome than Arabidopsis. The average rate of expansion of a Brassica genome segment would be ~1.5 times that of Arabidopsis when we account for the genome size of each species and the triplication of the Brassica genome. Thus, the evolution of the Brassica genome is more complicated than simple triplication. In fact, the region on A8 and the Crr2 region in A1 (LG6) overlap on chromosome 4 in A. thaliana (Suwabe et al. 2006), but the fine-scale genetic mapping and physical microstructure of the region reveal no genes in common (Fig. 3). This result is consistent with the observation that each Brassica homologous region has a different level of genome rearrangement (BRGSPC 2011, O’Neill and Bancroft 2000) and illustrates a non-uniform genome evolution in Brassica, with substantial gene retention and gene loss.
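The ~1.5-fold figure can be reproduced from the genome sizes given in the introduction (125 Mb for A. thaliana, 550 Mb for B. rapa) and a triplication factor of 3. The Python sketch below is a back-of-the-envelope check under the assumption that all three copies of every Arabidopsis segment are retained; the function name `expected_expansion` is hypothetical.

```python
def expected_expansion(brassica_mb: float, arabidopsis_mb: float, copies: int = 3) -> float:
    """Average size of one Brassica segment copy relative to its Arabidopsis counterpart."""
    return brassica_mb / (copies * arabidopsis_mb)


if __name__ == "__main__":
    # Genome sizes from the text: B. rapa 550 Mb, A. thaliana 125 Mb.
    print(round(expected_expansion(550, 125), 2))
```

Segments whose observed size ratio is at or below 1, like the one studied here, therefore fall well short of the genome-wide average.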
Brassica rapa and A. thaliana originated from a common ancestor and differentiated between about 20.4 and 14.5 million years ago (Koch et al. 2000, Yang et al. 1999). Recent studies that have gradually identified the genome structure of Brassica suggest that functionally conserved genes are maintained as genomic blocks. For example, the self-incompatibility (S) locus is inherited as a block in Brassica (Sato et al. 2002) and in Arabidopsis (Kusaba et al. 2001). The S locus contains at least two genes, SRK and SP11/SCR, which are the determinants of self-incompatibility in stigma and pollen (Schopfer et al. 1999, Takasaki et al. 2000, Takayama et al. 2000). Although they are highly polymorphic, they are inherited as a unit and have maintained self-incompatibility through co-evolution. The genomic region on A8 is highly conserved between Brassica and Arabidopsis (Fig. 3 and Table 1). The disease resistance gene cluster (MRC-H) of which it is part includes many genes for resistance to bacterial, fungal, and viral pathogens. Resistance genes often reside as a cluster complex in the genome, because the clustering of multiple genes with different capabilities enables the plant to maintain its defenses against corresponding pathogens (Holub 1997, Wei et al. 2002). At the same time, it allows the generation of diversity by recombination, accelerating the evolution of novel specificities for resistance (Hulbert et al. 1997). Thus the region on A8 shows genomic potential as an R-gene cluster in the Brassica genome. Further functional analyses, such as physiological and resistance analysis of each gene, will shed light on the evolution of R-gene clusters in the Brassica genome.
The region on A8 has a large inversion, corresponding to about 310 kb in the Arabidopsis genome, and insertion of genes homologous to those in other Arabidopsis chromosomes (Fig. 3). Gene deletion is also likely, although our strategy would not have shown it. The gene insertions are concentrated in a very small part of the inverted region, suggesting that a hot spot of genomic substitution occurs there. The suppression of genetic modification by the need to maintain function leads to an accumulation of repeat sequences and transposable elements (Fu et al. 2002). In barley, for example, the Mla locus for resistance to powdery mildew shows clustering and conservation in the genome, with a variety of duplications, inversions and transposon insertions (Wei et al. 2002). In addition, genes in the Gene Ontology category of “response to environment (including pathogen defense)” are overrepresented in the B. rapa genome, with an apparent bias (BRGSPC 2011). Taken together, these results suggest that the genomic region on A8 is inherited as a unit and the disease resistance cluster has been maintained through genome evolution with rearrangements in a hot spot of genomic substitution.
Our results confirm genomic microsynteny between B. rapa and A. thaliana and support the view that the origin of the diploid Brassica genome is more complicated than simple triplication. A potential for a disease resistance cluster in the Brassica genome is also suggested. For an accurate overview of genome evolution and the functional R-gene cluster in the Brassica genome, the ongoing complete sequencing of Brassica species will provide insights into mechanisms underlying the conservation and rearrangement of the genomes of differentiating species.
We thank Dr. Ian Bancroft of the John Innes Centre for his valuable comment and advice. We are grateful to Ms. K. Tanaka and Ms. H. Maeda for their technical assistance. This work was supported by the Cooperative System for Supporting Priority Research of the Japan Science and Technology Corporation, by a grant from the Ministry of Agriculture, Forestry and Fisheries of Japan (Rice Genome Project DM-2105) and by a grant-in-aid from the Ministry of Education, Culture, Sports, Science and Technology of Japan (No. 14360006).