More than 40 years ago, based on a few protein sequences from vertebrates, Susumu Ohno proposed polyploidization as a major source of new biological pathways created from duplicated gene copies [1]. Vertebrate genomes can be considered paleopolyploids that became modern diploids through ancestral chromosome fusions as well as sequence divergence between duplicated chromosomes. Recent paleogenomic analyses in plants have confirmed and refined Ohno's conclusions and led to the identification of polyploid common ancestors, showing that present-day species have been shaped through several rounds of whole genome duplications (WGDs), small scale duplications (SSDs) and copy number variations (CNVs) of tandem duplicated genes, followed by numerous chromosome fusion (CF) events leading to their present-day chromosome numbers [2]. Duplicate genes that persisted in multiple copies diverged by differentiation of sequence and/or function. Overall, recurrent gene or genome duplications generate functional redundancy followed by pseudogenization (that is, unexpressed or functionless paralogs), concerted evolution (that is, maintained function of paralogs), subfunctionalization (that is, partitioned function of paralogs), or neofunctionalization (that is, novel function of paralogs) during the course of genome evolution. Functional divergence of duplicated genes, by either subfunctionalization or neofunctionalization, has been proposed as one of the most important sources of evolutionary innovation in living organisms [5]. As a consequence, polyploidy followed by diploidization is a major mechanism that has shaped complex regulatory networks during the evolution of plant genomes. However, the real impact of genome duplication on gene network evolution, as assessed by comparing ancestral pre-WGD networks to modern post-WGD networks, is not clear. Recent access to numerous sequenced plant genomes [4] now offers the opportunity to study, at an unprecedented resolution, the impact of WGD on gene and genome organization as well as regulation.
Recent paleogenomics studies in plants, which compare modern genome sequences to reconstruct their common founder ancestors from shared duplication events, have characterized seven genome paleoduplications for the monocots and seven genome paleotriplications for the eudicots. These data led to the reconstruction of extinct ancestors with seven protochromosomes (9,731 protogenes) and five protochromosomes (9,138 protogenes) for the eudicots and monocots, respectively [4] (Figure ). These recent evolutionary studies in plants suggest that most duplicated genes that are structurally retained during evolution (referred to as 'persistent duplicated genes') have at least partially diverged in their function [6]. Microarray studies in eudicots and monocots showed that the vast majority of duplicated genes have diverged in their expression profiles, with 73% [8] and 88% [10] of gene pairs in Arabidopsis (eudicot reference genome) and rice (monocot reference genome), respectively, associated with asymmetric expression profiles after 50 to 100 million years of evolution. In maize, where a recent WGD occurred 5 million years ago (MYA) [11], more than 50% of the duplicated genes have been deleted and are no longer detectable within paralogous chromosomal blocks [12]. These results clearly demonstrate that most of the genetic redundancy originating from polyploidy events is erased by a massive loss of duplicated genes, through pseudogenization in one of the duplicated segments, soon after the polyploidization event.
Figure 1 Homolog gene conservation between wheat and cereal sequenced genomes. (a) Cereal genome paleohistory. Schematic representation of the phylogenetic relationships between grass species adapted from [2,4]. Divergence times from a common ancestor are indicated.
Because many genes are part of more global regulatory networks, a change in the expression pattern of a single gene could induce changes for numerous genes involved in the same functional pathway. Haberer et al. noted, for example, that tandem as well as segmental duplicate gene pairs exhibiting high cis-element similarities within promoters had divergent expression in Arabidopsis, suggesting that changes to a small fraction of cis-elements could be sufficient for neo- or subfunctionalization. We can argue that functional novelties derived from neo- or subfunctionalization of orthologous and paralogous copies may reduce the risk of extinction of plant species [14], similar to what has been suggested in mammals, where extinction events of vertebrate lineages are more frequent prior to the known ancestral WGD [16]. In this scenario, rapid genomic changes (that is, reciprocal gene loss) and functional changes (that is, neo- or subfunctionalization) following WGD might enable polyploids to adapt better or more quickly to environmental conditions, with improved physiological and morphological traits and properties that were not present or sufficient in their diploid progenitors. For instance, it has been suggested that neo- or paleopolyploidy may increase vigor [17], favor tolerance to environmental changes [15], and facilitate propagation through increased self-fertilization [18].
To gain insight into the impact of genome doubling on gene structure and expression, we performed high-throughput RNA sequencing (RNA-seq)-based inference of the grain filling gene network in bread wheat. We focused our functional experiments on a grain development time course so that comparable experiments could be run in other cereals (for example, rice in the next sections), based on the main conserved grain developmental phases: cell division, filling, and dehydration. Bread wheat is a good plant model to study the impact of distinct rounds of WGD on gene structure and function, as its genome comprises seven ancestral paleoduplications shared with all known cereal genomes and two recent neopolyploidization events that formed Triticum aestivum: a first hybridization between Triticum urartu (A genome) and an Aegilops speltoides-related species (B genome) 1.5 to 3 MYA, forming Triticum turgidum, and a second between T. turgidum (genomes A-B) and Aegilops tauschii (D genome) 10,000 years ago [20]. Bread wheat is thus a good genome model to study in the same analysis the impact of ancient and recent WGD on genome structure and function. The bread wheat genome architecture offers the opportunity to study the structures and corresponding expression patterns of not only paleoduplicated genes (50 to 70 million years of evolution) but also neoduplicated genes (1.5 to 3 million years of evolution), the latter by comparing expression profiles of A, B and D homoeologous gene copies, that is, homoeoalleles (Figure ). As the complete assembled wheat genome sequence is not yet available, we have used Brachypodium as reference genomes to investigate the grain filling gene network modification in response to recent and ancient evolutionary events, such as duplication, polyploidization and speciation. The aim of this study was not to perform a quantitative (that is, transcriptome) analysis of the genes expressed during grain development but rather a robust qualitative identification (that is, large scale repertoire) of homoeologous/orthologous/paralogous gene networks, allowing us to provide new insights into the structural and functional evolution of genes after a WGD event in plants. This article provides conclusions on how recent and ancient duplicated genes in plants evolve in both structure and function at the whole genome level, the gene family level, and the gene network level. The established divergence of structural and expression patterns between duplicated genes might have accelerated the erosion of colinearity between plant genomes, as discussed in the article.