The Brassica species, related to Arabidopsis thaliana, include an important group of crops and represent an excellent system for studying the evolutionary consequences of polyploidy. Previous studies have led to a proposed structure for an ancestral karyotype and models for the evolution of the B. rapa genome by triplication and segmental rearrangement, but these have not been validated at the sequence level.
We developed computational tools to analyse the public collection of B. rapa BAC end sequence, in order to identify candidates for representing collinearity discontinuities between the genomes of B. rapa and A. thaliana. For each putative discontinuity, one of the BACs was sequenced and analysed for collinearity with the genome of A. thaliana. Additional BAC clones were identified and sequenced as part of ongoing efforts to sequence four chromosomes of B. rapa. Strikingly few of the 19 inter-chromosomal rearrangements corresponded to the set of collinearity discontinuities anticipated on the basis of previous studies. Our analyses revealed numerous instances of newly detected collinearity blocks. For B. rapa linkage group A8, we were able to develop a model for the derivation of the chromosome from the ancestral karyotype. We were also able to identify a rearrangement event in the ancestor of B. rapa that was not shared with the ancestor of A. thaliana, and is represented in triplicate in the B. rapa genome. In addition to inter-chromosomal rearrangements, we identified and analysed 32 BACs containing the end points of segmental inversion events.
Our results show that previous studies of segmental collinearity between the A. thaliana, Brassica and ancestral karyotype genomes, although very useful, represent over-simplifications of their true relationships. The presence of numerous cryptic collinear genome segments and the frequent occurrence of segmental inversions mean that inference of the positions of genes in B. rapa based on the locations of orthologues in A. thaliana can be misleading. Our results will be of relevance to a wide range of plants that have polyploid genomes, many of which are being considered according to a paradigm of comprising conserved synteny blocks with respect to sequenced, related genomes.