1.  A naturally occurring InDel variation in BraA.FLC.b (BrFLC2) associated with flowering time variation in Brassica rapa 
BMC Plant Biology  2012;12:151.
Flowering time is an important trait in Brassica rapa crops. FLOWERING LOCUS C (FLC) is a MADS-box transcription factor that acts as a potent repressor of flowering. Expression of FLC is silenced when plants are exposed to low temperature, which activates flowering. There are four copies of FLC in B. rapa. Analyses of different segregating populations have suggested that BraA.FLC.a (BrFLC1) and BraA.FLC.b (BrFLC2) play major roles in controlling flowering time in B. rapa.
We analyzed the BrFLC2 sequence in nine B. rapa accessions and identified a 57-bp insertion/deletion (InDel) spanning exon 4 and intron 4 that results in a non-functional allele. In total, three types of transcripts were identified for this mutated BrFLC2 allele. The InDel was used to develop a PCR-based marker, which was used to screen a collection of 159 B. rapa accessions. The deletion genotype was present only in oil-type B. rapa, including ssp. oleifera and ssp. trilocularis, and not in other subspecies. The deletion genotype was significantly correlated with variation in flowering time. In contrast, the previously reported splice-site variation in BrFLC1, which also leads to a non-functional locus, was detected but was not correlated with variation in flowering time in oil-type B. rapa, although it was correlated with variation in flowering time in vegetable-type B. rapa.
Our results suggest that the naturally occurring deletion mutation spanning exon 4 and intron 4 of the BrFLC2 gene contributes greatly to variation in flowering time in oil-type B. rapa. The different relationships of BrFLC1 and BrFLC2 with flowering time variation indicate that the control of flowering time has evolved separately in the oil-type and vegetable-type B. rapa groups.
PMCID: PMC3487953  PMID: 22925611
2.  Biased Gene Fractionation and Dominant Gene Expression among the Subgenomes of Brassica rapa 
PLoS ONE  2012;7(5):e36442.
Polyploidization, both ancient and recent, is frequent among plants. A "two-step theory" was proposed to explain the meso-triplication of the Brassica "A" genome (Brassica rapa). By accurately partitioning this genome, we observed that genes in the less fractionated subgenome (LF) were dominantly expressed over the genes in the more fractionated subgenomes (MFs: MF1 and MF2), while the genes in MF1 were slightly dominantly expressed over the genes in MF2. The results indicated that the dominantly expressed genes tended to be resistant to gene fractionation. By re-sequencing two B. rapa accessions, a vegetable turnip (VT117) and a Rapid Cycling line (L144), we found that genes in LF had fewer non-synonymous or frameshift mutations than genes in the MFs; however, mutation rates were not significantly different between MF1 and MF2. The differences in gene expression patterns and ongoing gene death among the three subgenomes suggest that "two-step" genome triplication and differential subgenome methylation played important roles in the genome evolution of B. rapa.
PMCID: PMC3342247  PMID: 22567157
3.  Escape from Preferential Retention Following Repeated Whole Genome Duplications in Plants 
The well-supported gene dosage hypothesis predicts that genes encoding proteins engaged in dose-sensitive interactions cannot be reduced back to single copies once all interacting partners are simultaneously duplicated in a whole genome duplication. The genomes of extant flowering plants are the result of many sequential rounds of whole genome duplication, yet the fraction of genomes devoted to encoding complex molecular machines does not increase as fast as expected through multiple rounds of whole genome duplication. Using parallel interspecies genomic comparisons in the grasses and crucifers, we demonstrate that genes retained as duplicates following a whole genome duplication have only a 50% chance of being retained as duplicates in a second whole genome duplication. Genes that fractionated to a single copy following a second whole genome duplication tend to be the member of a gene pair with a less complex promoter, a lower level of expression, and a lower level of purifying selection. We suggest that the copy with lower expression and weaker purifying selection contributes less to effective gene-product dosage and is therefore under less dosage constraint in future whole genome duplications, providing an explanation for why flowering plant genomes are not overrun with subunits of large dose-sensitive protein complexes.
PMCID: PMC3355610  PMID: 22639677
polyploidy; gene dosage; gene loss; genome evolution; comparative genomics; crucifers; grasses
4.  Syntenic gene analysis between Brassica rapa and other Brassicaceae species 
Chromosomal synteny analysis is important in genome comparison to reveal the genomic evolution of related species. Shared synteny describes genomic fragments from different species that originated from an identical ancestor. Syntenic genes are orthologs located in these syntenic fragments, so they often share similar functions. Syntenic gene analysis is very important in Brassicaceae species for sharing gene annotations and investigating genome evolution. Here we designed and developed a direct and efficient tool, SynOrths, to identify pairwise syntenic genes between genomes of Brassicaceae species. SynOrths determines whether two genes are a conserved syntenic pair based not only on their sequence similarity, but also on the support of homologous flanking genes. Syntenic genes between Arabidopsis thaliana and Brassica rapa, Arabidopsis lyrata and B. rapa, and Thellungiella parvula and B. rapa were then identified using SynOrths. The occurrence of genome triplication in B. rapa was clearly observed: many genes that were evenly distributed in the genomes of A. thaliana, A. lyrata, and T. parvula had three syntenic copies in B. rapa. Additionally, there were many B. rapa genes that had no syntenic orthologs in A. thaliana, but some of these had syntenic orthologs in A. lyrata or T. parvula. Only 5,851 genes in B. rapa had no syntenic counterparts in any of the other three species. These 5,851 genes could have originated after B. rapa diverged from these species. We developed SynOrths, a tool for syntenic gene analysis between species of Brassicaceae, which can be used to accurately identify syntenic genes in differentiated but closely related genomes. With this tool, we identified syntenic gene sets between B. rapa and each of A. thaliana, A. lyrata, and T. parvula. Syntenic gene analysis is important not only for the gene annotation of newly sequenced Brassicaceae genomes, by bridging them to the model plant A. thaliana, but also for the study of genome evolution in these species.
PMCID: PMC3430884  PMID: 22969786
synteny; ortholog; Brassica rapa; Arabidopsis thaliana; Arabidopsis lyrata; Thellungiella parvula; Brassicaceae
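The criterion described in this abstract (sequence similarity plus support from homologous flanking genes) can be illustrated with a minimal Python sketch. This is not the SynOrths implementation: the function names, the `window` and `min_support` parameters, and the toy gene identifiers are all hypothetical, and the set of homologous pairs is assumed to come from a prior similarity search.

```python
from typing import List, Set, Tuple


def flank(order: List[str], gene: str, window: int) -> List[str]:
    """Return up to `window` genes on each side of `gene`, excluding `gene`."""
    i = order.index(gene)
    return order[max(0, i - window):i] + order[i + 1:i + 1 + window]


def is_syntenic_pair(
    a: str,
    b: str,
    order_a: List[str],
    order_b: List[str],
    homologs: Set[Tuple[str, str]],
    window: int = 2,        # hypothetical flank size, in genes
    min_support: int = 2,   # hypothetical number of supporting flank genes
) -> bool:
    """Call (a, b) syntenic if the genes are homologous AND enough genes
    flanking `a` have a homolog among the genes flanking `b`."""
    if (a, b) not in homologs:
        return False
    flank_b = set(flank(order_b, b, window))
    support = sum(
        1
        for g in flank(order_a, a, window)
        if any((g, h) in homologs for h in flank_b)
    )
    return support >= min_support


# Toy data (hypothetical): gene order along one chromosome in each genome.
order_a = ["A1", "A2", "A3", "A4", "A5"]
order_b = ["B1", "B2", "B3", "B4", "B5", "B6", "B7", "B8", "B9"]
# Homologous pairs, assumed to come from a sequence-similarity search.
homologs = {("A2", "B2"), ("A3", "B3"), ("A4", "B4"), ("A3", "B9")}

print(is_syntenic_pair("A3", "B3", order_a, order_b, homologs))  # True
print(is_syntenic_pair("A3", "B9", order_a, order_b, homologs))  # False
```

In the toy data, A3 is similar to both B3 and B9, but only B3 sits among homologs of A3's neighbours, so only the A3/B3 pair is called syntenic. This is how flanking support separates a positional ortholog from a dispersed homolog.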
5.  The Impact of Genome Triplication on Tandem Gene Evolution in Brassica rapa 
Whole genome duplication (WGD) and tandem duplication (TD) are both important modes of gene expansion. However, how WGD influences tandemly duplicated genes is not well studied. We used Brassica rapa, which has undergone an additional whole genome triplication (WGT) and shares a common ancestor with Arabidopsis thaliana, Arabidopsis lyrata, and Thellungiella parvula, to investigate the impact of genome triplication on tandem gene evolution. We identified 2,137, 1,569, 1,751, and 1,135 tandem gene arrays in B. rapa, A. thaliana, A. lyrata, and T. parvula, respectively. Among them, 414 conserved tandem arrays are shared by the three species without WGT and were therefore assumed to have existed in the diploid ancestor of B. rapa. Thus, after genome triplication, B. rapa would be expected to have 1,242 tandem arrays derived from the 414 conserved tandems. We found that 400 of the 414 tandems had at least one syntenic ortholog in the genome of B. rapa. Furthermore, 294 of these 400 maintain tandem arrays (more than one gene for each syntenic hit) in B. rapa. For the 294 tandem arrays, we obtained 426 copies of syntenic paralogous tandems in the triplicated genome of B. rapa. In this study, we demonstrated that tandem arrays in B. rapa were dramatically fractionated after WGT when compared either to non-tandem genes in the B. rapa genome or to the tandem arrays in closely related species that have not experienced a recent whole genome polyploidization event.
PMCID: PMC3509317  PMID: 23226149
whole genome duplication; tandem duplication; tandem gene evolution; Brassica rapa; Arabidopsis thaliana; Arabidopsis lyrata; Thellungiella parvula
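The counts reported in this abstract can be checked with a short Python calculation. The input numbers are taken from the abstract; treating 426 of 1,242 as a retained fraction is an interpretation made here for illustration, not a statistic stated by the authors, and the variable names are hypothetical.

```python
# Counts reported in the abstract.
conserved_arrays = 414   # tandem arrays shared by the three species without WGT
with_ortholog = 400      # of these, arrays with >= 1 syntenic ortholog in B. rapa
kept_as_tandem = 294     # of these, arrays still tandem in B. rapa
tandem_copies = 426      # syntenic paralogous tandem copies across the subgenomes

# A triplication gives three copies of each ancestral array if nothing is lost.
expected_arrays = conserved_arrays * 3

# Illustrative ratios (interpretation, not reported in the abstract).
retained_fraction = tandem_copies / expected_arrays
copies_per_kept_array = tandem_copies / kept_as_tandem

print(expected_arrays)                  # 1242
print(round(retained_fraction, 3))      # 0.343
print(round(copies_per_kept_array, 2))  # 1.45
```

Under this reading, roughly one third of the tandem arrays expected immediately after triplication are still present as tandems, which is consistent with the abstract's statement that tandem arrays were strongly fractionated after WGT.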

