Polyploidy, or whole-genome duplication (WGD), is an important genomic feature for all eukaryotes, especially many plants and some animals. The common occurrence of polyploidy suggests an evolutionary advantage of having multiple sets of genetic material for adaptive evolution. However, increased gene and genome dosages in autopolyploids (duplications of a single genome) and allopolyploids (combinations of two or more divergent genomes) often cause genome instabilities, chromosome imbalances, regulatory incompatibilities, and reproductive failures. Therefore, new allopolyploids must establish a compatible relationship between alien cytoplasm and nuclei and between two divergent genomes, leading to rapid changes in genome structure, gene expression, and developmental traits such as fertility, inbreeding, apomixis, flowering time, and hybrid vigor. Although the underlying mechanisms for these changes are poorly understood, some themes are emerging. There is compelling evidence that changes in DNA sequence, cis- and trans-acting effects, chromatin modifications, RNA-mediated pathways, and regulatory networks modulate differential expression of homoeologous genes and phenotypic variation that may facilitate adaptive evolution in polyploid plants and domestication in crops.
Polyploids can be classified into allopolyploids and autopolyploids based on the origins and levels of ploidy (25, 49, 135) (Figure 1). An autopolyploid results from doubling a diploid genome (Figure 1a). An allopolyploid is formed by the combination of two or more sets of distinct genomes. Mechanisms include interspecific hybridization followed by chromosome doubling (Figure 1b), fertilization of unreduced gametes between two diploid species (Figure 1c), or interspecific hybridization between two autotetraploids (Figure 1d). The two identical chromosomes (red or blue) within a species are homologous, while the chromosomes derived from different species (red and blue) are orthologous but become homoeologous within an allotetraploid. In an allopolyploid, only bivalents are formed because meiotic pairing occurs between homologous chromosomes (Figure 1b,c,d). If the homoeologous chromosomes have some segments that are homologous (Figure 1e), pairing may occur between the homoeologous chromosomes, resulting in the formation of multivalents and segmental allotetraploids (135). Here we consider allopolyploids and amphidiploids or disomic allopolyploids to be synonyms. Strictly speaking, only bivalents are formed in the amphidiploids and disomic allopolyploids, whereas multivalents may be formed in the allopolyploids or segmental allopolyploids.
In addition to polyploidy, some plant and animal species exist as intraspecific and interspecific hybrids (96, 124). Many plants that transmit as diploids are actually paleopolyploids (ancient polyploids), which are derived from at least one event of whole-genome duplication (WGD) followed by massive gene loss and genomic reorganization through a process known as diploidization (152). Arabidopsis (14, 17, 144), rice (154), and maize (45) are good examples of diploidized paleopolyploids. An estimated 30–70% of plant species are of polyploid origin (93, 152). That estimate is as high as 100% if paleopolyploids are included (152).
Polyploidy is a fundamental but relatively underexplored biological process. It is widespread but little is known about how duplicate genes and genomes function in the early stages of hybridization, and how the duplicate genes maintain and diverge functions during plant evolution and crop domestication. Many polyploids are ancient, and their exact progenitors are often unknown. Resynthesized polyploids with known progenitors are excellent materials for dissecting gene expression and genomic changes in early stages and comparisons with older polyploids (28, 79, 132, 149). In addition to the parental phenotypes, polyploids give rise to phenotypes that are intermediate between the two parents and to novel phenotypes that are absent in or exceed features of the contributing parents (77, 82, 119), suggesting nonadditive gene expression. Some traits, such as increasing levels of drought tolerance, apomixis, pest resistance, flowering-time variation, and organ size, may allow polyploids to enter new niches or improve their fitness. Indeed, polyploids may survive better than their diploid progenitors in harsh environments, such as high altitudes and latitudes and cold climates, whereas both diploids and polyploids often thrive and cohabit in mild conditions (49, 135). Moreover, polyploidy is a means of permanent fixation of hybrid vigor and dosage regulation, which may be why many crops (e.g., wheat, cotton, oats, canola, potato, peanuts, sugarcane, coffee, and strawberry) are of polyploid origin (63, 93). Thus, polyploidy has been studied in the context of evolution, genetics, breeding, and molecular biology (25, 29, 50, 76, 79, 86, 95, 111, 132, 149).
Interspecific hybridization and allopolyploidization occur frequently in plant taxa including Brassica (133), Gossypium (150), Senecio (1), Spartina (10), Tragopogon (141), and Triticum (40, 130). Furthermore, hybrids can be formed between different genera including Triticum (wheat)-Secale (rye) (62), Triticum (wheat)-Hordeum (barley) (106), Zea (maize)-Avena (oat) (123), and Zea (maize)-Tripsacum (gama grass) (58). Some allopolyploids (e.g., Tragopogon miscellus and T. mirus) were produced in natural conditions as recently as ~80 years ago, and new Tragopogon allotetraploids appear to form every year (141). In contrast, polyploids are rarer in animals than in plants (90, 107). Interspecific hybrids occur in vertebrates and mammals (e.g., a mule is a hybrid between a horse and a donkey), but they cannot produce offspring, probably because of genomic incompatibility and/or imbalance in imprinting and sex chromosome dosage (107, 112). Polyspermy (fertilization of one ovum by more than one sperm) causes human triploids in 1–3% of conceptions, and the triploid fetuses are aborted (100). An isolated case of a tetraploid South American rodent (Tympanoctomys barrerae) is still debatable (44, 139). Except for endopolyploidy (a diploid individual with cells containing more than 2 C amount of DNA in their nuclei) in some cell types (37), aneuploid and polyploid cells in animals and humans are often associated with malignant cell proliferation or carcinogenesis (136).
Several mechanisms may affect the fate of orthologous and homoeologous genes in polyploids (25, 29, 76, 79, 86, 95, 111, 132, 149) (Figure 2). First, the majority of homoeologous genes are coexpressed. Second, some duplicate genes are lost, mutate, or diverge (due to genetic changes). The half-life of an active paralogous gene that becomes mutated or lost is estimated to be 2–7 million years (87). Third, epigenetic changes may reprogram gene expression and developmental patterns of new allopolyploids. The impact of these mechanisms on various polyploids can be very different. For example, sequence elimination is predominately observed in wheat and Tragopogon allopolyploids (40, 130, 141); chromosomal translocations and transposition (insertion of a DNA fragment into homoeologous chromosomes) are common in Brassica allopolyploids (133); and changes in gene expression appear to be a major consequence in Arabidopsis and cotton allopolyploids (2, 73, 147, 148). Moreover, genetic and epigenetic changes may be interrelated (25); reactivating transposons by chromatin modifications or RNA-mediated pathways may lead to chromosomal breakages and rearrangements. Eliminating DNA sequences (regulatory and/or coding regions) may alter dosage-dependent gene regulation and chromatin structure.
Elimination of chromosome- or genome-specific sequences may occur during polyploid formation (Figure 2a). The stochastic changes of duplicate genes may promote polyploid speciation, which is supported by studies in Brassica (133), wheat (40, 130), and Tragopogon (141). In the resynthesized allopolyploids, loss of parental fragments and/or the appearance of novel fragments are commonly observed. Rapid sequence elimination in the resynthesized allopolyploid wheat may account for a relatively high amount (~14%) of genome- or chromosome-specific DNA sequences (40, 130), suggesting that differential elimination of genome-specific sequences facilitates pairing between homologous chromosomes but not homoeologous chromosomes.
Changes in DNA sequence may contribute to the loss of duplicate gene expression and function. Indeed, many isozyme loci are lost during polyploidization, such as chlorophyll a/b binding protein genes in Polystichum munitum, leucine aminopeptidase loci in tetraploid Chenopodium, and phosphoglucose isomerase loci in homosporous fern and Clarkia (48, 116). Estimates indicate that in the salmonid and cyprinid fish, the loss of duplicate isozyme loci can be as high as 35–65%, suggesting that loss of duplicate gene function is common after polyploidization, which occurred 50 million years ago (Mya) in this lineage (41). In Tragopogon, 9 of 10 genes that display expression differences are associated with changes in allelic DNA sequence (141). However, loss of gene function may also suggest an epigenetic cause (see below).
Genetic mutations can explain the cause of gene loss over evolutionary time, but many silencing phenomena may be epigenetically controlled, especially in the early stages of polyploid formation. When two different genomes are combined into a single cell, they must respond to the consequences of genome duplication, especially duplicate copies of genes with similar or redundant functions. Increased gene or genome dosage may induce disease syndromes and abnormal development (7, 38). Thus, the expression of orthologous genes must be reprogrammed through epigenetic mechanisms (Figure 2a) in the early process of polyploidization. This resembles the “genomic shock” phenomenon proposed by McClintock (98). The genomic shock occurs rapidly in interspecific hybrids and allopolyploids, resulting in demethylation of retroelements (92), relaxation of imprinting genes (20, 68, 145), and silencing and activation of homoeologous genes (2, 26, 69, 73, 91, 146-148), including rRNA genes subjected to nucleolar dominance (expression of rRNA genes from only one progenitor in an interspecific hybrid or allopolyploid) (117, 122). Epigenetic changes, which are potentially reversible, provide an effective and flexible means for a polyploid cell to respond to polyploidy or genomic shock. Moreover, gene silencing or activation that is initially epigenetic and reversible could be one step toward a genetically fixed and irreversible state. Epigenetic silencing may also accelerate sequence mutation rates of the affected genes, as observed in repeat-induced point mutations in duplicated genes of Neurospora crassa (128).
Mechanisms for epigenetic regulation of homoeologous genes in the allopolyploids are reminiscent of those for X-chromosome inactivation (75), gametic imprinting (143), paramutation (21, 121, 134), and homology-dependent gene silencing (11, 67, 94). However, ploidy-dependent gene regulation has some unique features. First, epigenetic interactions are established among four alleles of two homoeologous loci in allotetraploids compared with two alleles of one locus in a diploid. Second, homoeologous genes from different parental origins may be up- or downregulated in a chromosomal domain (73, 147), which is different from dosage compensation, a term that often refers to concerted or unidirectional changes in gene expression. Third, at least some epigenetic silencing phenomena in allopolyploids are stochastically established and require multiple generations (24, 104, 148), probably because of the complex process of sorting out chromosome pairing in the allopolyploids. Fourth, pairing occurs mainly between homologous chromosomes, but occasionally between homoeologous chromosomes in allopolyploids (Figure 1e), which may affect gene expression. Finally, because of divergence of regulatory sequences between the progenitors and of heterologous proteins produced in the allopolyploids, cis- and trans-acting effects on homoeologous genes (146, 151) in various biological pathways constitute a major mode of gene regulation in the allopolyploids.
Many genes display dosage dependency and are expressed additively in aneuploids and polyploids (53). If the levels of gene expression and phenotypic variation in the progenitors are additive, they would have the midparent values (MPVs) in the polyploids; that is, one plus one is equal to 2 (Figure 2d). If the gene expression is nonadditive (different from the MPV), the values would be larger than 2 or smaller than 2. The former would suggest gene activation including overdominance, whereas the latter would suggest repression and/or silencing. One model to explain additive gene regulation is that there are extra control settings or rheostat potentials (four levels) in a tetraploid compared with two levels in a diploid (Figure 2b). In Arabidopsis and Brassica, the alleles in the FLOWERING LOCUS C (FLC) loci display additive effects on flowering time (102). In Arabidopsis allotetraploids, the expression of A. thaliana and A. arenosa FLC loci is additive, giving rise to a late flowering phenotype (146). Furthermore, up to ~90% of the transcriptome is expressed additively in resynthesized Arabidopsis allotetraploids (147). Although odd and even dosage effects may vary in a ploidy series (see below), coexpression and coevolution of orthologs and paralogs suggest that a selective advantage is obtained from dosage dependency (15, 47).
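The distinction between additive and nonadditive expression can be made concrete with a minimal sketch. It assumes expression values measured from equal amounts of RNA, so that the allotetraploid value is compared with the mean of the two parents, and it uses a hypothetical relative tolerance `tol` to decide when a value counts as equal to the MPV. Published analyses rely on replicated measurements and statistical tests rather than a fixed cutoff, and the function names below are illustrative only.

```python
def midparent_value(parent_a: float, parent_b: float) -> float:
    """Mean of the two parental expression levels (the MPV)."""
    return (parent_a + parent_b) / 2.0


def classify_expression(parent_a: float, parent_b: float,
                        allopolyploid: float, tol: float = 0.1) -> str:
    """Classify allopolyploid expression relative to the MPV.

    tol is a hypothetical relative tolerance chosen for illustration;
    it is not a threshold taken from the studies cited in the text.
    """
    mpv = midparent_value(parent_a, parent_b)
    if mpv == 0:
        # No parental expression: any signal in the allopolyploid is activation.
        return "additive" if allopolyploid == 0 else "nonadditive (activated)"
    deviation = (allopolyploid - mpv) / mpv
    if abs(deviation) <= tol:
        return "additive"
    if deviation > 0:
        return "nonadditive (activated)"
    return "nonadditive (repressed)"


if __name__ == "__main__":
    print(classify_expression(1.0, 1.0, 1.0))   # equal to the MPV
    print(classify_expression(1.0, 1.0, 2.0))   # above the MPV
    print(classify_expression(2.0, 1.0, 0.5))   # below the MPV
```

Values above the MPV correspond to the activation case described above, and values below it to repression or silencing.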
Similar numbers of genes (5–11%) are differentially expressed between the parents and resynthesized allotetraploids in stable allotetraploids and five selfing generations (148), and a slightly lower number of genes (~2.5%) are differentially expressed between a natural allotetraploid, A. suecica, and its assumed progenitors (73). Most gene expression changes observed in early generations are maintained in the late generations and natural allotetraploids, suggesting that rapid and stochastic changes in the resynthesized allotetraploids are responsible for adaptive evolution (148).
Furthermore, allopolyploidy may induce regulatory incompatibilities as well as selective advantage by combining heterologous protein products (Figure 2c). For instance, there is evidence that some protein heterodimers may not function as well as homodimers or vice versa (115, 118). Thus, a silencing strategy could balance regulatory incompatibility and the advantages of having multiple copies of orthologous genes or gene products (e.g., transcriptional factors) spontaneously produced in an allopolyploid cell. Alternatively, novel interactions between heterologous protein products may provide a molecular basis for hybrid vigor and novel adaptation.
New species are often gradually formed because of geographical and ecological separations from an ancestral species (49). However, new species are believed to have arisen suddenly via polyploidization in plants and some animals, including vertebrates such as amphibians and lizards (18, 49, 93, 112). For example, Arabidopsis suecica (2n = 4x = 26) is a natural allotetraploid formed between 12,000 years ago and 1.5 Mya (64, 71, 125). The two progenitor species, A. thaliana and Arabidopsis arenosa (108, 123), split ~6 Mya (71), similar to the distance between humans and chimpanzees (~6.3 million years) (114). Despite this distance, A. thaliana autotetraploid (2n = 4x = 20) and A. arenosa tetraploid (2n = 4x = 32) can hybridize to produce A. suecica-like plants (2n = 4x = 26) (Figure 3a,b,c). A. arenosa is thought to be an autotetraploid (31), but sequencing analysis suggests that it is not a pure autotetraploid (146) (L. Tian, J. Wang & Z.J. Chen, unpublished). The resynthesized allotetraploids are meiotically stable (30, 147) and contain five pairs of A. thaliana chromosomes and eight pairs of A. arenosa chromosomes (28, 30, 147) (Figure 3c). Compared with resynthesized Brassica and wheat allopolyploids that undergo rapid changes in chromosomal structure and DNA sequences (40, 133), the frequency of aneuploids and chromosome abnormalities in Arabidopsis resynthesized allotetraploids is relatively low (30).
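The chromosome numbers quoted above follow from simple gamete arithmetic, sketched here. The sketch assumes normal reduction at meiosis, so that each tetraploid parent contributes half of its somatic chromosome number; the function names are illustrative.

```python
def reduced_gamete(somatic_2n: int) -> int:
    """Chromosome number of a reduced gamete, assuming regular meiosis."""
    if somatic_2n % 2 != 0:
        raise ValueError("somatic chromosome number must be even")
    return somatic_2n // 2


def allotetraploid_2n(maternal_2n: int, paternal_2n: int) -> int:
    """Somatic chromosome number of the hybrid formed from two reduced gametes."""
    return reduced_gamete(maternal_2n) + reduced_gamete(paternal_2n)


# Somatic numbers taken from the text.
A_THALIANA_4X = 20  # A. thaliana autotetraploid, 2n = 4x = 20
A_ARENOSA_4X = 32   # A. arenosa tetraploid, 2n = 4x = 32

thaliana_gamete = reduced_gamete(A_THALIANA_4X)  # 10 chromosomes
arenosa_gamete = reduced_gamete(A_ARENOSA_4X)    # 16 chromosomes

# 10 + 16 = 26, the number reported for A. suecica-like plants.
print(allotetraploid_2n(A_THALIANA_4X, A_ARENOSA_4X))

# In the allotetraploid these form 5 A. thaliana and 8 A. arenosa pairs.
print(thaliana_gamete // 2, arenosa_gamete // 2)
```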
The nascent allotetraploids (F1 individuals) are genetically identical (Figure 3a) and showed subtle phenotypic variation. Some variation among F1 individuals may derive from heterozygosity of the outcrossing tetraploid A. arenosa parent, whereas other variation may result from interactions between A. arenosa and the different genotypes of A. thaliana used in interspecific hybridizations. The degree of variation depends on parental genotypes used in the interspecific hybridization. For example, the seed set is higher in the nascent allotetraploids (F1) between A. arenosa and A. thaliana C24 or Ler ecotype than in those between A. arenosa and A. thaliana Columbia (20, 31), indicating genotypic effects on interspecific hybridization. Hybridization was successful only in the crosses using A. thaliana as a maternal parent and A. arenosa as a pollen donor (31) (J. Wang, L. Tian & Z.J. Chen, unpublished), probably because A. arenosa is outcrossing and self-incompatible. Most F1 individuals and selfing progeny in late generations resemble the A. arenosa parent and A. suecica (31, 91, 147) (Figure 3c), although diverse phenotypes are observed in segregating populations (F2-F3) (Figure 3b). Therefore, A. arenosa appears to be phenotypically dominant over A. thaliana in the allotetraploids (28, 147).
The allotetraploids obtained from selfing the F1s show stable karyotypes in the fifth generation (Figure 3c), but exhibit a wide range of variants, some of which are absent in either parent (transgression) (Figure 4a). Moreover, the allotetraploids display hybrid vigor: larger rosettes, more leaves, longer and wider leaves, and taller plants than the parents. The fertility rate of the plants in the selfing progeny varies from one lineage to another (30, 31). The overall level of fertility improves after each generation of selfing, suggesting that genome incompatibility and gene expression divergence between the progenitors are gradually overcome (148).
The flower colors varied from pink (like A. arenosa) in the early generation (F1) to a mixture of pink and white flowers in the intermediate generations (S2–4) and white in the late generation (S5). During selfing (S3), there is a low frequency of mixed white and pink flowers in the same flower branch (Figure 4b, bottom), which is transient and mosaic (derived from the same zygote). The appearance of variegation within the same flower branch suggests rapid changes in the expression of genes involved in anthocyanin synthesis pathways, probably via epigenetic regulation.
In 1928, Navashin coined the term “amphiplasty” to describe chromosomal changes in interspecific hybrids of Crepis (110). He defined “differential amphiplasty” as specific changes in a few chromosomes (disappearance of satellites or secondary constrictions) and “general amphiplasty” as the overall changes in chromosomal morphology (shortening, thickening, or lengthening of chromosomes) from one species in the interspecific hybrids or amphidiploids. Changes in chromosomal morphology might also affect gene expression. Indeed, following the pioneering work of Navashin & McClintock (97, 110), several contemporary researchers demonstrated that differential amphiplasty is synonymous with nucleolar dominance (117, 122). The disappearance of the secondary constrictions is caused by silencing of rDNA loci in those chromosomes (117). Nucleolar dominance is observed in Drosophila interspecific hybrids and Xenopus, Arabidopsis, Brassica, and wheat allopolyploids (117, 122). The dominance is reversible and developmentally regulated and is controlled by chromatin modifications involving DNA methylation and histone acetylation (26, 27). Blocking histone deacetylation or DNA methylation derepresses the silenced rRNA genes subjected to nucleolar dominance. Both DNA methylation and histone hypoacetylation reinforce the formation of the “inactive” chromatin state, resulting in gene silencing (72). The silencing of rDNA chromatin requires at least one histone deacetylase (AtHDA6) that is localized in nucleoli (35).
General amphiplasty may be similar to the effects of genomic shock (98). Combining two genomes in a “new” polyploid cell may generate the genomic shock and release some constraints imposed on unstable elements locked in a junk yard (e.g., transposable elements in heterochromatin). Little is known about the consequences of general amphiplasty or genomic shock on interspecific hybrids or allopolyploids that have balanced pairs of chromosomes.
To begin to test this, Wang et al. (144) studied transcriptome divergence in Arabidopsis allotetraploids and their progenitors. First, they compared gene expression differences between the two progenitors using the spotted oligo-gene microarrays designed from ~26,000 annotated genes that share a high percentage of sequence identities between A. thaliana and A. arenosa. Most of the oligos can cross-hybridize with both A. thaliana and A. arenosa genes (74). More than 15% of the transcriptome is differentially expressed between A. thaliana and A. arenosa that diverged ~6 Mya. Approximately 2,100 genes (8%) are more abundantly expressed in A. thaliana than in A. arenosa, whereas 1,818 genes (7%) are expressed at higher levels in A. arenosa than in A. thaliana. Second, Wang et al. (144) compared mRNA abundance in an allotetraploid with the midparent value (MPV: an equal mixture of RNAs from two parents). If the genes from two progenitors are additively expressed (Figure 2d), their cumulative expression levels in the allotetraploid are equal to MPV. Nonadditive expression suggests that at least one of the homoeologous genes is up- or downregulated. There may also be instances in which silencing of a locus is compensated by increased expression of its homoeologous locus, which cannot be detected in this comparison. Wang et al. (146) found that 2,011 genes (~8%) are nonadditively expressed in two independently derived allotetraploids using a common variance analysis, and up to ~38% of genes using a per-gene variance analysis. Interestingly, ~68% of the genes that are nonadditively expressed in the allotetraploids are differentially expressed between the two parents, suggesting that the genes with species-specific expression patterns are subjected to expression changes in the allopolyploids. Remarkably, among the nonadditively expressed genes, more than 65% of the genes are downregulated in the allotetraploids.
Among them, >94% of the genes that are expressed at higher levels in A. thaliana than in A. arenosa are downregulated in the allotetraploids. These data indicate that the genes with A. thaliana expression patterns tend to be repressed, whereas the genes with A. arenosa expression patterns are transcriptionally dominant in the allotetraploids, coincident with the phenotypic and nucleolar dominance of A. arenosa in the allotetraploids (117, 147) (Figures 3 and 4).
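The gene counts and percentages reported for the microarray comparison can be cross-checked with a few lines of arithmetic. This sketch assumes that the denominator is the ~26,000 annotated genes represented on the array, which the text gives only as an approximation, so the results are approximate as well.

```python
ARRAY_GENES = 26_000  # approximate number of annotated genes on the array (from the text)


def percent_of_array(n_genes: int, total: int = ARRAY_GENES) -> float:
    """Share of the arrayed genes, in percent."""
    return 100.0 * n_genes / total


higher_in_thaliana = 2_100           # more abundant in A. thaliana
higher_in_arenosa = 1_818            # more abundant in A. arenosa
nonadditive_common_variance = 2_011  # nonadditive in both allotetraploids

for label, count in [
    ("higher in A. thaliana", higher_in_thaliana),
    ("higher in A. arenosa", higher_in_arenosa),
    ("differential between parents", higher_in_thaliana + higher_in_arenosa),
    ("nonadditive (common variance)", nonadditive_common_variance),
]:
    print(f"{label}: {count} genes, {percent_of_array(count):.1f}% of array")
```

Under that assumption the two parental classes round to 8% and 7%, and together they slightly exceed 15% of the arrayed genes, in line with the figures quoted.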
Interestingly, similar levels of transcriptional changes were observed in maize diploid hybrids (9.8%) (140) and polyploid taxa of Senecio (5%) (60), wheat (7.7%) (59), and cotton (5%) (2). The high percentage of gene expression changes in the Tragopogon allopolyploids (~17.5%) (141) is partly associated with a high level of polymorphism (~11%) within populations between the two parents and a moderate amount of variation (>2.5%) among allopolyploid populations. These numbers are also similar to those observed in interspecific hybrids of Drosophila (103, 151), suggesting that the levels of transcriptional changes induced by hybridization may be fairly consistent even across plant and animal kingdoms. Transcriptome dominance is also observed in an analysis of ~210,000 expressed sequence tags (ESTs) derived from an ovular cDNA library of tetraploid cotton (Gossypium hirsutum L.) (153). The upland cotton was formed by ancient interspecific hybridization between AA and DD genome species (150). AA subgenome ESTs of all functional classifications including cell cycle control and transcription factor activity were selectively enriched in G. hirsutum L., a result consistent with the production of long lint fibers in AA genome species. Therefore, transcriptome dominance is likely a general consequence of hybridization effects on gene expression in interspecific hybrids and allopolyploids.
The number of genes displaying expression changes in A. thaliana autopolyploids is much smaller than that in the allotetraploids (147). In yeast, Galitski et al. (43) found that 10 genes are induced and seven genes are reduced in response to an increase in ploidy levels (haploid, diploid, triploid, and tetraploid). The cell size increases with increasing ploidy levels, which is correlated with repression of G1 cyclins (Cln1 and Pcl1). FLO11, a gene important to the invasiveness of the yeast cells, is repressed with increasing ploidy levels. The reduction of FLO11 expression in cells of higher ploidy is correlated with diminished invasion, suggesting a role of ploidy-dependent gene regulation in adaptive evolution. Collectively, the data indicate that genome doubling has smaller effects on gene expression changes than intergenomic hybridization.
What factors affect transcriptome dominance in the allopolyploids? Is the gene repression controlled by widespread chromatin modifications or a few “key” regulatory genes? Over time, the progenitor species may have evolved to possess species-specific gene expression patterns. Modulation of the species-specific expression of these genes may determine the outcome of transcriptional and posttranscriptional competition between the two parental genomes in their offspring. Changes in chromatin landscape on repressed genes may result from concerted modifications of many genes in one species, perhaps by a mechanism similar to that for nucleolar dominance (117). Alternatively, expression changes in a few regulatory genes such as transcription factors and microRNAs may induce trans-acting effects on many downstream pathways (25). For example, a Myb transcription factor gene is responsible for hybrid-induced incompatibilities in Drosophila interspecific hybrids (8). Also, a single miRNA can regulate hundreds of genes involved in the transition from one developmental stage to another (39).
The role of chromatin modifications in silencing or activating protein-coding genes in allopolyploids has been demonstrated in several recent studies (70, 73, 91, 104, 146, 148). Silenced genes can be reactivated by aza-dC (73), a chemical inhibitor of DNA methylation, or by downregulation of the genes encoding DNA methyltransferases using RNA interference (RNAi) (148). Treating allotetraploids with aza-dC generates pleiotropic effects on natural and synthetic allotetraploids including reactivation of mobile elements (91). Reactivation of transposons is also observed in the synthetic allotetraploids (92). The above data suggest that two species may have possessed different levels of chromatin modifications for many genes that display species-specific expression patterns. Perturbation of chromatin structure may have occurred during the formation of interspecific hybrids or allopolyploids, leading to the changes in gene expression.
Factors other than chromatin modifications may also be responsible for genome-wide nonadditive gene regulation. Nonadditively expressed genes are randomly distributed along the chromosomes (147). Within a small chromosomal region in which TCP3 and RFP genes are located, A. thaliana TCP3 is expressed, whereas A. arenosa TCP3 is silenced (73). For RFP, A. thaliana RFP is repressed, whereas A. arenosa RFP is expressed. Interestingly, the neighboring genes located between TCP3 and RFP loci are coexpressed. The above data are reminiscent of the silenced rRNA genes that are restricted in the rDNA loci (81). Furthermore, in the met1-RNAi A. suecica lines, several silenced genes are not reactivated (148). These data argue that widespread chromatin remodeling does not explain nonadditive regulation for all genes, but support the notion that each gene is regulated through interactions among homoeologous loci such as paramutation-like phenomena observed in A. thaliana tetraploids (104).
Genome-wide nonadditive gene regulation observed in the allotetraploids correlates with expression divergence between the parents. Thus, hybrids derived from distantly related species may induce a high level of gene expression changes in a nonadditive fashion, providing molecular bases of hybrid vigor (13) and phenotypic variation in the allotetraploid progeny (31). Hybrid vigor refers to the performance of an F1 hybrid higher than MPV or the best parent. The genetic basis for heterosis is predicted to be associated with dominant complementation of slightly deleterious recessives (dominance model) (19, 66) or overdominant gene action in which genes have greater expression in heterozygous conditions (overdominance model) (32, 36). According to the dominance model, highest performance should be observed when all dominant favorable genes from both parents are in homozygous conditions. The overdominance model suggests that heterosis should reach its peak at the maximum levels of heterozygosity and dissipate when approaching homozygosity. Moreover, overdominance is accompanied by nonallelic or epistatic interactions, and epistasis is involved in most QTLs associated with inbreeding depression and heterosis in corn (137) and rice (84). Comparing genome-wide gene expression data with phenotypic traits (QTLs) may provide new insights into the role of gene expression changes in various biological pathways that give rise to hybrid vigor.
The gene expression changes observed in maize diploid hybrids (6, 140) and genomewide transcriptome dominance in Arabidopsis allotetraploids (147, 148) support both dominance and overdominance models. Many genes in energy, metabolism, cellular biogenesis, and plant hormonal regulation are upregulated in the allotetraploids (147), which may contribute to the hybrid vigor observed in the allotetraploids. Although the underlying mechanisms are unknown, one possibility is modulation of a few key regulators in the allotetraploids that may control downstream genes in various biological pathways (146, 147) such as photosynthesis and metabolism (Z. Ni & Z.J. Chen, unpublished).
Alternatively, cis- and trans-acting effects involving regulatory sequence changes (see below), chromatin modifications, and RNA-mediated pathways (25) (Figure 2a) may explain dominance, overdominance, and epistasis. The interactions between the diverged orthologous protein products may determine repression or activation of progenitors’ genes in allopolyploids (Figure 2c) of Arabidopsis (148), cotton (2), Senecio (60), and wheat (59, 69), interspecific hybrids (151) in Drosophila, intraspecific diploid hybrids in maize (6, 54, 140), and sex-dependent gene regulation in Drosophila (46, 120). These mechanisms are not mutually exclusive, and the diverged protein-protein and protein-DNA interactions in allopolyploids may trigger repression of the protein-coding genes and rDNA loci derived from one progenitor (e.g., A. thaliana) via chromatin modifications (26, 73, 146, 148) or novel expression patterns leading to hybrid vigor (Figure 2c).
Stable allopolyploids provide an excellent system for testing cis- and trans-acting effects because a common set of protein factors is present in the same allotetraploid cells. After the unification of the distinct genomes, differences in cis- and trans-regulation contribute to changes in the expression of orthologs that become homoeologous pairs in the allopolyploid or interspecific nucleus (146, 151). Cis-regulatory divergence directly acts on single genes or localized chromatin domains such as promoters or enhancers and may result in asymmetric accumulation of homoeologous transcripts in allopolyploids.
There is evidence for cis- and trans-effects on orthologous or homoeologous genes in the allotetraploids (146) and interspecific hybrids (33, 151). Differential expression of progenitors’ genes in Arabidopsis allopolyploids (73, 148) and interspecific hybrids (33), Drosophila interspecific hybrids (151), and maize diploid hybrids (138) is mainly caused by cis-regulatory changes. Progenitor-specific differences in expression in the same cells are most likely due to allelic or epigenetic differences. In contrast, expression divergence due to alterations in trans-regulatory hierarchies should result in two kinds of expression changes. The first is a difference in the sum of homoeologous mRNAs compared with the mid-value of the two parents or nonadditive gene expression. Indeed, the divergently expressed orthologs comprise ~68% of the genes that were expressed in a nonadditive fashion in two allotetraploids (146, 147), implicating trans-acting effects. The second is a change in the ratio of homoeolog-encoded mRNAs in an allopolyploid compared with the ratio of the two orthologs in a 1:1 mixture of the parental mRNAs (147). Such a difference would demonstrate a regulatory interaction between the parental genomes (for example, the failure of an interspecific heterodimer to activate transcription) (25) (Figure 2c). A regulatory hierarchy (12) model suggests that trans-regulatory differences predominate in allopolyploids (25).
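The two signatures of trans-regulatory divergence described above can be made concrete with a minimal sketch. It assumes that expression values for the parents and for each homoeolog in the allotetraploid are already normalized to a common scale; the class name, function names, and the 20% tolerance are hypothetical choices for illustration, not values taken from the cited studies.

```python
import math
from dataclasses import dataclass


@dataclass
class HomoeologExpression:
    """Hypothetical container for normalized expression values of one gene."""
    parent1: float  # ortholog expression in progenitor 1
    parent2: float  # ortholog expression in progenitor 2
    allo_h1: float  # progenitor-1-derived homoeolog in the allopolyploid
    allo_h2: float  # progenitor-2-derived homoeolog in the allopolyploid


def is_nonadditive(e: HomoeologExpression, tol: float = 0.2) -> bool:
    """First signature: the summed homoeolog expression departs from the
    mid-parent value by more than an (assumed) relative tolerance."""
    mid_parent = (e.parent1 + e.parent2) / 2
    total = e.allo_h1 + e.allo_h2
    return abs(total - mid_parent) > tol * mid_parent


def ratio_shift(e: HomoeologExpression) -> float:
    """Second signature: log2 of the homoeolog ratio in the allopolyploid
    relative to the ortholog ratio in a 1:1 mixture of parental mRNAs.
    A value near 0 means the parental ratio is preserved."""
    allo_ratio = e.allo_h1 / e.allo_h2
    mix_ratio = e.parent1 / e.parent2
    return math.log2(allo_ratio / mix_ratio)


# Illustrative (made-up) values: one gene that keeps the parental pattern
# and one that is both nonadditive and shifted in homoeolog ratio.
conserved = HomoeologExpression(parent1=10, parent2=30, allo_h1=5, allo_h2=15)
diverged = HomoeologExpression(parent1=10, parent2=30, allo_h1=2, allo_h2=8)
```

In this sketch a gene with preserved homoeolog ratio but unchanged total would be consistent with purely cis-regulatory divergence, whereas a departure in either measure would point toward a trans-acting contribution; real analyses would need replicates and a statistical test rather than a fixed tolerance.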
The species-specific expression patterns observed in Arabidopsis allotetraploids (147) may result from sequence divergence at regulatory elements during the ~6 million years that separate the parental species. Cis- and trans-acting regulation and epigenetic modifications of homoeologous genes may change regulatory interactions in a biological pathway (Figure 5a). This has been demonstrated in a subset of genes controlling flowering-time variation. A. arenosa and A. thaliana (Ler) diverged in flowering habits probably because of selective adaptation to cold and warm climates (108, 123), respectively. Natural variation of flowering time is largely controlled by two epistatically acting loci, namely, FRIGIDA (FRI) (65) and FLOWERING LOCUS C (FLC) (102, 131). FRI upregulates FLC expression that represses flowering in A. thaliana. A. thaliana has a nonfunctional AtFRI (65), whereas A. arenosa FRI (AaFRI) is functional (146). Compared with A. thaliana (AtFLC), A. arenosa FLC (AaFLC) loci possess deletions in the promoter and first intron that are important to cis-regulation of FLC expression. In resynthesized allotetraploids, AaFRI complements nonfunctional AtFRI and interacts in trans with AtFLC, making the synthetic allotetraploids winter annual in a dosage-dependent manner. AaFRI acts on AtFLC in trans and on AaFLC in cis because A. thaliana FRI is nonfunctional. The different effects of AaFRI on AtFLC and AaFLC loci are likely dependent on the sequence divergence in their cis-regulatory elements (e.g., deletions in the promoter and first intron). AtFLC and AaFLC upregulation is mediated by H3-K9 acetylation and H3-K4 methylation, suggesting a role of FRI in locus-specific chromatin modifications (146).
Although our model (Figure 5a) simplifies the flowering pathway that involves >80 genes (80), it offers one explanation of the fate of orthologous genes involved in biological pathways during allopolyploidization. Many orthologous genes might have diverged in their cis-regulatory elements that confer strong or weak, dominant or recessive alleles, tissue-specific expression, and/or developmental regulation. The regulatory networks may be reset by cis- and trans-acting effects via chromatin modification immediately after allopolyploidization. Over generations, genetic and epigenetic changes are subject to selection and adaptation, and additional genes (e.g., MAF, an FLC-MAF family member of MADS-box genes, in A. suecica) (146) may be activated for allopolyploids to occupy an environmental niche. A similar mechanism may be responsible for the functional diversification of orthologous genes in developmental regulation of gene expression, a phenomenon known as subfunctionalization of duplicate genes (88). Flowering time directly affects plant reproduction and adaptation. Therefore, sequence evolution and epigenetic regulation play interactive and pervasive roles in mediating the regulatory incompatibilities between divergent genomes, leading to natural variation and selective adaptation during allopolyploid evolution.
Some gene expression variation observed in polyploids may be controlled at the level of post-transcriptional regulation (70). Silencing a duplicate copy of homoeologous RNA in polyploids may be part of an RNA-mediated pathway similar to cosuppression (67, 94) or RNAi (42). Silencing of transgenes is correlated with transgene dosage in Drosophila (113) and ploidy levels in Arabidopsis (105). Activation of Wis 2–1A retrotransposon in the newly synthesized wheat allotetraploids drives the readout transcripts from adjacent sequences including the antisense or sense strands of known genes, leading to the silencing or activation of respective corresponding genes (70). RNA-mediated silencing of duplicate genes in polyploids is a developmental strategy. Production of progenitor-dependent RNA transcripts may be associated with mRNA accumulation and stability during growth and development (55). For example, a subset of genes involved in mRNA stability displayed expression variation in allotetraploids (E. Kim & Z.J. Chen, unpublished). CCR4, a gene involved in RNA stability and degradation in yeast and animals (34), is differentially accumulated in leaves and flower buds, suggesting a role of RNA stability in transcript accumulation in allopolyploids.
Over time, species may have adapted to spatial and temporal regulation of RNA transcripts including mRNAs, small RNAs, and additional noncoding RNA transcripts that could accumulate nonadditively in the allotetraploids (Figure 5b). The small RNAs may serve as negative regulators for the expression of target genes originating from two parents. First, the loci encoding miRNAs and siRNAs may diverge during the evolution of the progenitors as observed in the FRI and FLC loci (146). Sequence divergence in promoter regions and cis-acting elements leads to the expression variation when these loci are present in the same cell nuclei. Alternatively, differential expression of trans-acting factors may cause gene expression changes in the allopolyploids, as predicted (25). Second, antiviral RNAi genes involved in the biogenesis of small RNAs such as dicers, Argonaute, and RNA-dependent RNA polymerases (9, 155) generally diverge faster than other proteins during evolution (106). Combining two divergent proteins in the allotetraploids may alter enzymatic activity and specificity. As a result, different pools of small RNAs could accumulate in the allopolyploids. Third, natural sense and anti-sense transcripts and other read-through transcripts may participate in defense mechanisms (16) and may be accumulated differently in the progenitors. These transcripts affect the expression of neighboring loci as well as other loci via trans-acting effects. Fourth, mRNA transcript abundance is different in the progenitors. Although the coding sequences are very similar among Arabidopsis and its related species, sequences at the noncoding regions (5′- and 3′-ends) diverge relatively rapidly (L. Tian, M. Ha & Z.J. Chen, unpublished), which may affect processing and stability of RNA transcripts (55). 
Finally, each species is differentiated by the presence or absence of species-specific repetitive DNA sequences, including transposons that may affect the chromatin structure and expression of their neighboring genes. Differences in DNA replication and perturbation of chromatin structures among different species may induce the release of transposons and aberrant RNA transcripts that cause “genomic shock” and many downstream effects, as previously predicted (25, 98).
During polyploid evolution, both copies of orthologous genes may remain if dosage effects are advantageous (142), or one copy of the gene duplicate may evolve a novel function via neofunctionalization (89). Alternatively, both copies may diverge their functions or expression patterns in different organs or tissues via subfunctionalization (88). Indeed, silenced rRNA genes in vegetative tissues are reactivated during flower development (27). In a survey of 40 genes in cotton, 10 genes (25%) display unequal expression in allotetraploids and exhibit organ-specific expression patterns (2). For 5 genes, the A-subgenome loci are expressed higher than the D-homoeologous loci, whereas for the other 4 genes, the D-subgenome loci are expressed higher than the A-homoeologous loci. For some homoeologous gene pairs, one locus (e.g., AdhA) is silenced in one organ, whereas the other locus is silenced in another organ. This silencing scheme is genotype-independent and occurs in both synthetic and natural cotton allotetraploids (3), suggesting rapid subfunctionalization of duplicate genes and stable maintenance of tissue-specific expression patterns during evolution.
Although the mechanisms for developmental control of the expression of orthologous genes are unclear, developmental regulation of orthologous genes immediately after allopolyploid formation suggests that duplicate genes provide genetic robustness against null mutations (52) and dosage-dependent selective advantage (15, 142). Moreover, immediate divergence in the expression of orthologous genes in allopolyploids provides a virtually inexhaustible reservoir for generating genetic variation and phenotypic diversification, which facilitates natural selection and adaptive evolution.
Dosage-dependent gene regulation shows odd and even effects, which may affect additive and nonadditive gene regulation in polyploids. Using B-A chromosome translocation lines in maize, Birchler and his colleagues (53) generated a series of lines with different doses of A chromosomes that could be used to measure gene expression in response to changes in chromosome dosage. Gene expression levels are generally positively correlated with the dosage of the genes or chromosomes in these lines. However, the expression levels of ~10% of the genes are either reduced or negatively correlated with odd chromosome dosages (e.g., one, three, and five). One possibility is that dosage-dependent gene regulation is associated with chromosome pairing because one or more copies of chromosomes in odd dosages cannot pair properly.
The odd and even effects on gene regulation are also observed in the study of transgene expression in diploid and triploid hybrids derived from the crosses of diploid or tetraploid plants with a diploid strain containing a single copy of a transgenic resistance gene in an active state (105). The expression of the transgene is reduced in the triploids compared with the diploid hybrids, leading to the loss of the resistant phenotype at various stages of seedling development in some individuals. The reduction of gene expression was reversible under selective tissue culture conditions. This type of suppression was observed for a single-copy insert in the absence of other trans-acting copies of the transgene and is therefore different from homology-dependent gene silencing. An increase in ploidy or chromosome dosage can give rise to epigenetic gene silencing, generating stochastic variations in gene expression patterns. Although the expression of the transgene in a haploid or a pentaploid was not studied, odd ploidy may result in a new type of epigenetic repression. The expression of the transgene is repressed only in the triploids in which one set of chromosomes is likely not paired or improperly paired.
Ploidy-dependent gene regulation suggests a sensing mechanism for gene dosage and DNA content via chromosome pairing. Although somatic pairing has not been documented in plants, such transient pairing has been observed in humans, Drosophila, and yeast (101). Homologous pairing has been implicated in transvection, position-effect variegation, and transgene silencing (5, 61, 113), all of which involve alterations in gene expression.
Paramutation is the result of heritable changes in gene expression that occur upon interaction between alleles (21). The phenomenon was first discovered in plants and later found in many other organisms including mammals (mouse and human) (21, 121, 134). The paramutagenic allele induces the change in the expression state of the paramutable allele. A paramutation-like phenomenon was also discovered in the tetraploid plants containing active and inactive transgene alleles of hygromycin phosphotransferase (HPT) (104). Active alleles that are trans-inactivated by their silenced counterparts are observed in tetraploid but not in diploid plants, and this occurs only in progeny resulting from self-fertilization of plants heterozygous for the active and inactive HPT allele. The occurrence of transgene paramutation only in tetraploid plants indicates that active and inactive alleles go through meiosis together. This led to the hypothesis of pairing-based trans-inactivation. This prediction is consistent with observations in tetraploid tomato, where the frequency of paramutation of a specific paramutagenic allele at the sulfurea locus is different between diploid, triploid, and tetraploid plants and depends on the ratio of paramutagenic to paramutable alleles (57). This suggests a counting mechanism for polyploidy-dependent paramutation, which may be similar to that for X-chromosome inactivation (75).
A paramutation-like phenomenon occurs in the progeny of genetic crosses between heterozygotes and between heterozygotes and wild-type mice independent of gender combination (121). The phenomenon is speculated to be associated with aberrant RNAs resulting from the paramutagenic allele that are packaged in sperm and cause paramutation upon transmission to the next generation. Indeed, paramutation depends on an RNA-dependent RNA polymerase; the rdr101 mutation prevents paramutation in maize (4). However, paramutation in Arabidopsis tetraploids is probably not associated with RNA because trans-inactivation does not occur in the F1 generation (104). Moreover, crosses of the decrease in DNA methylation 1 (ddm1) mutant with a paramutable tetraploid do not change paramutation phenotypes in the F1 or F2 but do in the F3 family, which is consistent with the gradual loss of DNA methylation by ddm1. The data suggest that methylation occurs later, and is speculated to occur during physical contact of the epialleles during meiosis and after the silencing is established (104). Alternatively, sorting out pairing between homologous and homoeologous chromosomes in polyploids may require a few more rounds of meiosis.
Many paramutation phenomena are associated with repeated sequences (21, 134). Multicopy genes or repetitive intergenic regions are a major trigger for the formation of silenced chromatin. Repeated sequences, whether inverted or tandem, can give rise to the production of dsRNA, an important trigger for RNA silencing as well as heterochromatin formation (85). In addition, repetitive sequences are also able to associate physically with their homologs in nonmeiotic cells (134). It is conceivable that different repeat sequences originating in the progenitors may trigger abnormal siRNA production and heterochromatin formation that are responsible for paramutation-like or other epigenetic phenomena in allopolyploids.
Incompatibility between alien cytoplasmic and nuclear genomes and between alien nuclear genomes is believed to be a barrier leading to reproductive isolation, speciation, and developmental abnormalities in vertebrates and plants (18, 77, 78, 82, 112). Breaking down this barrier is essential in forming a new polyploid species (78). Seed fertility may be controlled by a few genes or many genetic loci. Three imprinted genes, PHERES1, MEIDOS, and MEDEA, are silenced in allotetraploids in a dosage-dependent manner (68). Disrupting maternal imprinting of AtPHERES1 and paternal imprinting of MEDEA may reduce seed viability in the allopolyploids. Imbalance of paternally and maternally imprinted genes in the endosperm may also cause reproductive failures (20).
Another factor affecting seed fertility is the breeding system. Self-incompatibility is a mechanism for preventing inbreeding in many plant species (22). The sporophytic self-incompatibility system in the family Brassicaceae has been used as a model system to study mating system evolution in plants (99, 108). A. arenosa is an outbreeder (self-incompatible), and A. thaliana is an inbreeding plant (self-compatible). They diverged from the same ancestor ~6 Mya (67, 108). However, the natural allotetraploid A. suecica and the resynthesized allopolyploids are self-compatible (Figure 6), suggesting that the mating system switches immediately following polyploidization. The loss of self-incompatibility in the first generation of allotetraploids is not caused by the segregation of S-alleles in the allotetraploids because all possible alleles are present. The data suggest rapid epigenetic changes in the expression of the genes important to self-incompatibility, probably including the well-characterized loci encoding S-locus receptor kinase (SRK) and S-locus cysteine-rich (SCR) proteins in Brassicaceae (109, 127).
In many cases, polyploidization converts self-incompatible diploids into self-compatible tetraploids in Nicotiana and Solanum with a gametophytic system, and some allopolyploids become self-compatible regardless of the mating types of their parents (18, 49, 82). Selfing may be advantageous for the establishment of new allopolyploids because of their increased levels of heterozygosity. Inbreeding depression (1/18 or ~5.6% homozygosity in one selfing generation) in allopolyploids is relatively low compared with that in a diploid (50%). An extreme form of reproductive modification is apomixis, which is commonly associated with polyploidy (18, 49). Resynthesized allopolyploids may be released from reproductive failure if they are capable of vegetative or seed apomixis.
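The 1/18 and 50% figures can be reproduced with a short enumeration. The sketch below assumes a tetraploid that carries two copies of each allele (written here as the hypothetical genotype AAaa) and random segregation of the four chromosomes into two-chromosome gametes; this segregation model is an assumption made for illustration, and an allopolyploid with strict disomic pairing would behave differently. The function names are likewise illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def gamete_frequencies(genotype: str) -> dict:
    """Enumerate all equally likely gametes carrying half of the chromosomes
    (random chromosome segregation is assumed)."""
    size = len(genotype) // 2
    counts = Counter("".join(sorted(g)) for g in combinations(genotype, size))
    total = sum(counts.values())
    return {gamete: Fraction(n, total) for gamete, n in counts.items()}


def homozygous_fraction_after_selfing(genotype: str) -> Fraction:
    """Fraction of selfed progeny that carry only one kind of allele."""
    freqs = gamete_frequencies(genotype)
    homozygous = Fraction(0)
    for g1, p1 in freqs.items():
        for g2, p2 in freqs.items():
            if len(set(g1 + g2)) == 1:  # every allele in the zygote is identical
                homozygous += p1 * p2
    return homozygous


tetraploid = homozygous_fraction_after_selfing("AAaa")  # Fraction(1, 18)
diploid = homozygous_fraction_after_selfing("Aa")       # Fraction(1, 2)
```

Under these assumptions the AAaa parent produces AA, Aa, and aa gametes in a 1:4:1 ratio, so only AA x AA and aa x aa unions (1/36 each) yield homozygotes, whereas half of the progeny of a selfed Aa diploid are homozygous.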
Theoretical prediction suggests that one copy of a gene duplicate would become lost by accumulation of deleterious mutations over an evolutionary timescale (87). Evidently, many duplicate genes are retained during evolution, and the redundancy conferred by duplicate genes may facilitate species adaptation (107) and genetic robustness (52) against changes in environmental conditions and developmental programs. Gene expression analyses indicate that duplicate genes offer genetic robustness against null mutations in yeast (52) and tend to experience expression divergence during development and to evolve faster between Drosophila species and within yeast species than single-copy genes (51, 83). Using all duplicate gene pairs derived from a recent WGD (14, 17, 144) and gene expression microarrays in A. thaliana (126), Ha et al. (56) found that expression divergence between gene duplicates is significantly higher in response to external stresses than to internal developmental changes. Rapid divergence between gene duplicates in response to abiotic and biotic stresses may facilitate subfunctionalization (88), neofunctionalization (89), and the evolution of an adaptive mechanism to environmental changes (98, 129). A relatively slow rate of expression divergence between the duplicates may provide dosage-dependent selective advantage (15, 142) and enable organisms to fine-tune complex regulatory networks. Orthologous and homoeologous genes in allopolyploids may have similar evolutionary fates.
The interactions between cytoplasm-nuclear and nuclear-nuclear genomes in the allopolyploids may induce genomic shock (98) and general amphiplasty (110) that are manifested by differential accumulation of transcripts originating from divergent species, leading to transcriptome dominance and activation or silencing of one or both homoeologous loci through genetic and/or epigenetic mechanisms. As a consequence, allopolyploids display hybrid vigor, flowering-time variation, inbreeding, apomixis, and selective advantage. Over time, orthologous or homoeologous loci in the allopolyploids may diverge their functions via neofunctionalization and subfunctionalization (88, 89), as predicted for the paralogous loci (see Sidebar: Adaptive Evolution and Expression Divergence Between Duplicate Genes).
In polyploid populations of separate origin, one population may lose function from one copy of an orthologous gene, while a second population may lose function from a second copy of this ortholog. This “reciprocal silencing” of duplicated genes in polyploid genomes would ultimately lead to hybrid lethality (8), promoting reproductive isolation and the origin of new species. Following this model, the stochastic silencing and subfunctionalization of orthologous genes in different lineages of allopolyploids (148) may play a major role in the origin of new species. Together, these mutually inclusive mechanisms may contribute significantly to the adaptation potential, domestication, and evolution of polyploid plants.
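The consequence of reciprocal silencing can be illustrated with a small Mendelian enumeration. This is a sketch under stated assumptions: the two duplicate copies behave as unlinked loci with disomic inheritance, one population is fixed for a null allele at the first locus and the other population at the second, and the gene is essential. The allele symbols F (functional) and n (null) and the function name are hypothetical.

```python
from fractions import Fraction
from itertools import product


def f2_fraction_without_functional_copy() -> Fraction:
    """Cross population 1 (locus 1 null, locus 2 functional) with population 2
    (locus 1 functional, locus 2 null). The F1 is F/n at both loci, so each F1
    gamete carries F or n at each locus with equal probability."""
    f1_gametes = list(product("Fn", repeat=2))  # four equally likely gametes
    lacking = 0
    total = 0
    for egg, pollen in product(f1_gametes, repeat=2):
        total += 1
        # The zygote has no functional copy only if all four alleles are null.
        if "F" not in egg + pollen:
            lacking += 1
    return Fraction(lacking, total)


lethal_fraction = f2_fraction_without_functional_copy()  # Fraction(1, 16)
```

Under these assumptions 1/16 of the F2 progeny inherit no functional copy of a single reciprocally silenced gene pair; the fraction of affected progeny would increase if many duplicate pairs were silenced reciprocally in the two populations.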
Addressing the following questions is essential for gaining new insights into the molecular and evolutionary impacts of polyploid formation and speciation.
I thank Jianlin Wang, Hyeon-See Lee, Lu Tian, Zhongfu Ni, Misook Ha, Letricia Nogueira, Eun-Deok Kim, Jinsuk Lee, and Suk-Hwan Yang for their contributions to the gene expression data, and Donald Levin, Andrew Woodward, and two anonymous reviewers for critical reading and suggestions to improve the manuscript. I apologize for not citing many enlightening reviews and papers published in this exciting field owing to space limitations. The work was supported by grants from the National Science Foundation (MCB0608602, DBI0501712, and DBI0624077) and the National Institutes of Health (GM067015).