Several members of the R2R3-MYB family of transcription factors act as regulators of lignin and phenylpropanoid metabolism during wood formation in angiosperm and gymnosperm plants. The angiosperm Arabidopsis has over one hundred R2R3-MYB genes; however, only a few members of this family have been discovered in gymnosperms.
We isolated and characterised full-length cDNAs encoding R2R3-MYB genes from the gymnosperms white spruce, Picea glauca (13 sequences), and loblolly pine, Pinus taeda L. (five sequences). Sequence similarities and phylogenetic analyses placed the spruce and pine sequences in diverse subgroups of the large R2R3-MYB family, although several of the sequences clustered closely together. We searched the highly variable C-terminal region of diverse plant MYBs for conserved amino acid sequences and identified 20 motifs in the spruce MYBs, nine of which have not previously been reported and three of which are specific to conifers. The number and length of the introns in spruce MYB genes varied significantly, but their positions were well conserved relative to angiosperm MYB genes. Quantitative RT-PCR analysis of MYB gene transcript abundance in root and stem tissues revealed diverse expression patterns; three MYB genes were preferentially expressed in secondary xylem, whereas others were preferentially expressed in phloem or were ubiquitous. The MYB genes expressed in xylem, and three others, were up-regulated in the compression wood of leaning trees within 76 hours of induction.
Our survey of 18 conifer R2R3-MYB genes clearly showed a gene family structure similar to that of Arabidopsis. Three of the sequences are likely to play a role in lignin metabolism and/or wood formation in gymnosperm trees, including a close homolog of the loblolly pine PtMYB4, shown to regulate lignin biosynthesis in transgenic tobacco.
Lignin is a phenolic heteropolymer in secondary cell walls that plays a major role in the development of plants and their defense against pathogens. The biosynthesis of monolignols, which represent the main component of lignin, involves many enzymes. Cinnamyl alcohol dehydrogenase (CAD) is a key enzyme in lignin biosynthesis, as it catalyzes the final step in the synthesis of monolignols. The CAD gene family has been studied in Arabidopsis thaliana, Oryza sativa and partially in Populus. This is the first comprehensive study on the CAD gene family in woody plants including genome organization, gene structure, phylogeny across land plant lineages, and expression profiling in Populus.
The phylogenetic analyses showed that CAD genes fall into three main classes (clades), one of which is represented by CAD sequences from gymnosperms and angiosperms. The other two clades are represented by sequences only from angiosperms. All Populus CAD genes, except PoptrCAD 4, are distributed in Class II and Class III. CAD genes associated with xylem development (PoptrCAD 4 and PoptrCAD 10) belong to Class I and Class II. Most of the CAD genes are physically distributed on duplicated blocks and are still in conserved locations on the homeologous duplicated blocks. Promoter analysis of CAD genes revealed several motifs involved in gene expression modulation under various biological and physiological processes. The CAD genes showed different expression patterns in poplar, with only two genes preferentially expressed in xylem tissues during lignin biosynthesis.
The phylogeny of CAD genes suggests that the radiation of this gene family may have occurred in the early ancestry of angiosperms. Gene distribution on the chromosomes of Populus showed that both large scale and tandem duplications contributed significantly to the CAD gene family expansion. The duplication of several CAD genes seems to be associated with a genome duplication event that happened in the ancestor of Salicaceae. Phylogenetic analyses associated with expression profiling and results from previous studies suggest that CAD genes involved in wood development belong to Class I and Class II. The other CAD genes from Class II and Class III may function in plant tissues under biotic stresses. The conservation of most duplicated CAD genes, the differential distribution of motifs in their promoter regions, and the divergence of their expression profiles in various tissues of Populus plants indicate that genes in the CAD family have evolved tissue-specialized expression profiles and may have divergent functions.
The genus Populus includes poplars, aspens and cottonwoods, which will be collectively referred to as poplars hereafter unless otherwise specified. Poplars are the dominant tree species in many forest ecosystems in the Northern Hemisphere and are of substantial economic value in plantation forestry. Poplar has been established as a model system for genomics studies of growth, development, and adaptation of woody perennial plants including secondary xylem formation, dormancy, adaptation to local environments, and biotic interactions.
As part of the poplar genome sequencing project and the development of genomic resources for poplar, we have generated a full-length (FL)-cDNA collection using the biotinylated CAP trapper method. We constructed four FLcDNA libraries using RNA from xylem, phloem and cambium, and green shoot tips and leaves from the P. trichocarpa Nisqually-1 genotype, as well as insect-attacked leaves of the P. trichocarpa × P. deltoides hybrid. Following careful selection of candidate cDNA clones, we used a combined strategy of paired end reads and primer walking to generate a set of 4,664 high-accuracy, sequence-verified FLcDNAs, which clustered into 3,990 putative unique genes. Mapping FLcDNAs to the poplar genome sequence combined with BLAST comparisons to previously predicted protein coding sequences in the poplar genome identified 39 FLcDNAs that likely localize to gaps in the current genome sequence assembly. Another 173 FLcDNAs mapped to the genome sequence but were not included among the previously predicted genes in the poplar genome. Comparative sequence analysis against Arabidopsis thaliana and other species in the non-redundant database of GenBank revealed that 11.5% of the poplar FLcDNAs display no significant sequence similarity to other plant proteins. By mapping the poplar FLcDNAs against transcriptome data previously obtained with a 15.5 K cDNA microarray, we identified 153 FLcDNA clones for genes that were differentially expressed in poplar leaves attacked by forest tent caterpillars.
This study has generated a high-quality FLcDNA resource for poplar and the third largest FLcDNA collection published to date for any plant species. We successfully used the FLcDNA sequences to reassess gene prediction in the poplar genome sequence, perform comparative sequence annotation, and identify differentially expressed transcripts associated with defense against insects. The FLcDNA sequences will be essential to the ongoing curation and annotation of the poplar genome, in particular for targeting gaps in the current genome assembly and further improvement of gene predictions. The physical FLcDNA clones will serve as useful reagents for functional genomics research in areas such as analysis of gene functions in defense against insects and perennial growth. Sequences from this study have been deposited in NCBI GenBank under the accession numbers EF144175 to EF148838.
Comparative genomics can inform us about the processes of mutation and selection across diverse taxa. Among seed plants, gymnosperms have been lacking in genomic comparisons. Recent EST and full-length cDNA collections for two conifers, Sitka spruce (Picea sitchensis) and loblolly pine (Pinus taeda), together with full genome sequences for two angiosperms, Arabidopsis thaliana and poplar (Populus trichocarpa), offer an opportunity to infer the evolutionary processes underlying thousands of orthologous protein-coding genes in gymnosperms compared with an angiosperm orthologue set.
Based upon pairwise comparisons of 3,723 spruce and pine orthologues, we found an average synonymous genetic distance (dS) of 0.191, and an average dN/dS ratio of 0.314. Using a fossil-established divergence time of 140 million years between spruce and pine, we extrapolated a nucleotide substitution rate of 0.68 × 10⁻⁹ synonymous substitutions per site per year. When compared to angiosperms, this indicates a dramatically slower rate of nucleotide substitution in conifers: on average 15-fold. In parallel, we found a three-fold higher dN/dS for the spruce-pine lineage compared to the poplar-Arabidopsis lineage. This joint occurrence of a slower evolutionary rate in conifers with higher dN/dS, and possibly positive selection, showcases the uniqueness of conifer genome evolution.
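The extrapolated rate can be reproduced from the two figures reported above. The short sketch below assumes the usual convention that synonymous divergence accumulates along both lineages since their split, so the per-lineage rate is dS divided by twice the divergence time; the variable names are illustrative, and the implied angiosperm rate is only a back-of-the-envelope consequence of the stated 15-fold difference.

```python
# Values reported in the text.
mean_dS = 0.191            # average synonymous distance, spruce vs. pine
divergence_years = 140e6   # fossil-established divergence time (years)

# Assumption: both lineages accumulate substitutions independently,
# so the total elapsed branch length is 2 * T.
rate = mean_dS / (2 * divergence_years)
print(f"{rate:.2e} synonymous substitutions per site per year")  # 6.82e-10

# Rough implied angiosperm rate, taking the reported 15-fold difference at face value.
implied_angiosperm_rate = 15 * rate
print(f"{implied_angiosperm_rate:.2e}")
```

The result, about 0.68 × 10⁻⁹, matches the rate quoted in the abstract.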
Our results are in line with documented reduced nucleotide diversity, conservative genome evolution and low rates of diversification in conifers on the one hand and numerous examples of local adaptation in conifers on the other hand. We propose that reduced levels of nucleotide mutation in large and long-lived conifer trees, coupled with large effective population size, were the main factors leading to slow substitution rates but retention of beneficial mutations.
Seed plants are composed of angiosperms and gymnosperms, which diverged from each other around 300 million years ago. While much light has been shed on the mechanisms and rate of genome evolution in flowering plants, such knowledge remains conspicuously meagre for the gymnosperms. Conifers are key representatives of gymnosperms and the sheer size of their genomes represents a significant challenge for characterization, sequencing and assembly.
To gain insight into the macro-organisation and long-term evolution of the conifer genome, we developed a genetic map involving 1,801 spruce genes. We designed a statistical approach based on kernel density estimation to analyse gene density and identified seven gene-rich isochors. Groups of co-localizing genes were also found that were transcriptionally co-regulated, indicative of functional clusters. Phylogenetic analyses of 157 gene families for which at least two duplicates were mapped on the spruce genome indicated that ancient gene duplicates shared by angiosperms and gymnosperms outnumbered conifer-specific duplicates by a ratio of eight to one. Ancient duplicates were much more translocated within and among spruce chromosomes than conifer-specific duplicates, which were mostly organised in tandem arrays. Both high synteny and collinearity were also observed between the genomes of spruce and pine, two conifers that diverged more than 100 million years ago.
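The idea of using kernel density estimation to locate gene-rich regions on a genetic map can be illustrated with a minimal sketch. This is not the statistical procedure used in the study, whose details are not given here; it is a plain Gaussian kernel estimate in which the map positions, the bandwidth, and the "twice the mean density" threshold are all hypothetical choices made for illustration.

```python
import math

def gaussian_kde(positions, bandwidth):
    """Return a function giving estimated gene density (genes per cM) at position x."""
    norm = 1.0 / (bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        # Sum one Gaussian bump per mapped gene.
        return sum(norm * math.exp(-0.5 * ((x - p) / bandwidth) ** 2)
                   for p in positions)

    return density

# Hypothetical gene positions (cM) on one linkage group, with a cluster near 20 cM.
positions = [2.0, 11.5, 18.0, 19.0, 19.5, 20.0, 20.5, 21.0, 22.0, 35.0, 48.0, 60.0]
density = gaussian_kde(positions, bandwidth=2.0)

# Evaluate on a 1 cM grid and flag points well above the average density.
grid = list(range(0, 61))
mean_density = sum(density(x) for x in grid) / len(grid)
threshold = 2 * mean_density  # arbitrary illustrative cut-off
rich = [x for x in grid if density(x) > threshold]
print("gene-rich grid points (cM):", rich)
```

A real analysis would also need a principled bandwidth and a significance test against a null distribution of gene positions, which this sketch omits.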
Taken together, these results indicate that much genomic evolution has occurred in the seed plant lineage before the split between gymnosperms and angiosperms, and that the pace of evolution of the genome macro-structure has been much slower in the gymnosperm lineage leading to extant conifers than that seen for the same period of time in flowering plants. This trend is largely congruent with the contrasted rates of diversification and morphological evolution observed between these two groups of seed plants.
Angiosperm; duplication; evolution; gene families; genetic map; gymnosperm; phylogenomics; Picea; spruce; structural genomics
Phenylalanine ammonia lyase (PAL) is a key enzyme of the phenylpropanoid pathway that catalyzes the deamination of phenylalanine to trans-cinnamic acid, a precursor for the lignin and flavonoid biosynthetic pathways. To date, PAL genes have been less extensively studied in gymnosperms than in angiosperms. Our interest in PAL genes stems from their potential role in the defense responses of Pinus taeda, especially with respect to lignification and production of low molecular weight phenolic compounds under various biotic and abiotic stimuli. In contrast to all angiosperms for which reference genome sequences are available, P. taeda has previously been characterized as having only a single PAL gene. Our objective was to re-evaluate this finding, assess the evolutionary history of PAL genes across major angiosperm and gymnosperm lineages, and characterize PAL gene expression patterns in Pinus taeda.
We compiled a large set of PAL genes from the largest transcript dataset available for P. taeda and other conifers. The transcript assemblies for P. taeda were validated through sequencing of PCR products amplified using gene-specific primers based on the putative PAL gene assemblies. Verified PAL gene sequences were aligned and a gene tree was estimated. The resulting gene tree was reconciled with a known species tree and the time points for gene duplication events were inferred relative to the divergence of major plant lineages.
In contrast to angiosperms, gymnosperms have retained a diverse set of PAL genes distributed among three major clades that arose from gene duplication events predating the divergence of these two seed plant lineages. Whereas multiple PAL genes have been identified in sequenced angiosperm genomes, all characterized angiosperm PAL genes form a single clade in the PAL gene tree, suggesting they are derived from a single gene in an ancestral angiosperm genome. The five distinct PAL genes detected and verified in P. taeda were derived from a combination of duplication events predating and postdating the divergence of angiosperms and gymnosperms.
Gymnosperms have a more phylogenetically diverse set of PAL genes than angiosperms. This inference has contrasting implications for the evolution of PAL gene function in gymnosperms and angiosperms.
WRKY III genes have significant functions in regulating plant development and resistance. The WRKY gene family has been studied in many plant species; however, a comprehensive analysis of WRKY III genes in the woody plant poplar is still lacking. Three representative lineages of flowering plants are incorporated in most analyses: Arabidopsis (a model plant for annual herbaceous dicots), grape (a model plant for perennial dicots) and Oryza sativa (a model plant for monocots).
In this study, we identified 10, 6, 13 and 28 WRKY III genes in the genomes of Populus trichocarpa, grape (Vitis vinifera), Arabidopsis thaliana and rice (Oryza sativa), respectively. Phylogenetic analysis revealed that the WRKY III proteins could be divided into four clades. By microsynteny analysis, we found that the duplicated regions were more conserved between poplar and grape than between poplar and Arabidopsis or rice. We dated the duplications of Populus WRKY III genes by Ks analysis and demonstrated that all the duplicated blocks were formed after the divergence of monocots and dicots. Strong purifying selection has played a key role in the maintenance of WRKY III genes in Populus. Tissue expression analysis of the WRKY III genes in Populus revealed that five were most highly expressed in the xylem. We also performed quantitative real-time reverse transcription PCR analysis of WRKY III genes in Populus treated with salicylic acid, abscisic acid and polyethylene glycol to explore their stress-related expression patterns.
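Dating a duplication from Ks typically relies on the relation T = Ks / (2λ), where λ is the synonymous substitution rate per site per year. The sketch below shows only that arithmetic; the abstract does not state the rate that was used, so both the rate and the Ks value here are placeholder assumptions and not figures from the study.

```python
def duplication_age(ks, rate_per_site_per_year):
    """Estimate the age of a duplication (years) as Ks / (2 * rate)."""
    return ks / (2 * rate_per_site_per_year)

# Placeholder inputs, chosen for illustration only.
assumed_rate = 9.1e-9   # hypothetical synonymous substitution rate
example_ks = 0.25       # hypothetical Ks for one duplicated gene pair

age = duplication_age(example_ks, assumed_rate)
print(f"estimated duplication age: {age / 1e6:.1f} million years")
```

Because the estimate scales inversely with the assumed rate, any comparison with the monocot-dicot divergence is only as reliable as that rate.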
This study highlighted the duplication and diversification of the WRKY III gene family in Populus and provided a comprehensive analysis of this gene family in the Populus genome. Our results indicated that the majority of Populus WRKY III genes arose through large-scale gene duplication. The expression patterns of the PtrWRKYIII genes suggest that these genes play important roles in the xylem during poplar growth and development, and may play a crucial role in the response to drought stress. The results presented here may aid in the selection of appropriate candidate genes for further characterization of their biological functions in poplar.
This article was reviewed by Prof Dandekar and Dr Andrade-Navarro.
WRKY III; Microsynteny; Gene duplication; Expression; Populus
Members of the pine family (Pinaceae), especially species of spruce (Picea spp.) and pine (Pinus spp.), dominate many of the world's temperate and boreal forests. These conifer forests are of critical importance for global ecosystem stability and biodiversity. They also provide the majority of the world's wood and fiber supply and serve as a renewable resource for other industrial biomaterials. In contrast to angiosperms, functional and comparative genomics research on conifers, or other gymnosperms, is limited by the lack of a relevant reference genome sequence. Sequence-finished full-length (FL)cDNAs and large collections of expressed sequence tags (ESTs) are essential for gene discovery, functional genomics, and for future efforts of conifer genome annotation.
As part of a conifer genomics program to characterize defense against insects and adaptation to local environments, and to discover genes for the production of biomaterials, we developed 20 standard, normalized or full-length enriched cDNA libraries from Sitka spruce (P. sitchensis), white spruce (P. glauca), and interior spruce (P. glauca-engelmannii complex). We sequenced and analyzed 206,875 3'- or 5'-end ESTs from these libraries, and developed a resource of 6,464 high-quality sequence-finished FLcDNAs from Sitka spruce. Clustering and assembly of 147,146 3'-end ESTs resulted in 19,941 contigs and 26,804 singletons, representing 46,745 putative unique transcripts (PUTs). The 6,464 FLcDNAs were all obtained from a single Sitka spruce genotype and represent 5,718 PUTs.
This paper provides detailed annotation and quality assessment of a large EST and FLcDNA resource for spruce. The 6,464 Sitka spruce FLcDNAs represent the third largest sequence-verified FLcDNA resource for any plant species, behind only rice (Oryza sativa) and Arabidopsis (Arabidopsis thaliana), and the only substantial FLcDNA resource for a gymnosperm. Our emphasis on capturing FLcDNAs and ESTs from cDNA libraries representing herbivore-, wound- or elicitor-treated induced spruce tissues, along with incorporating normalization to capture rare transcripts, resulted in a rich resource for functional genomics and proteomics studies. Sequence comparisons against five plant genomes and the non-redundant GenBank protein database revealed that a substantial number of spruce transcripts have no obvious similarity to known angiosperm gene sequences. Opportunities for future applications of the sequence and clone resources for comparative and functional genomics are discussed.
There is a rapidly growing awareness that plant peptide signalling molecules are numerous and varied and they are known to play fundamental roles in angiosperm plant growth and development. Two closely related peptide signalling molecule families are the CLAVATA3-EMBRYO-SURROUNDING REGION (CLE) and CLE-LIKE (CLEL) genes, which encode precursors of secreted peptide ligands that have roles in meristem maintenance and root gravitropism. Progress in peptide signalling molecule research in gymnosperms has lagged behind that of angiosperms. We therefore sought to identify CLE and CLEL genes in gymnosperms and conduct a comparative analysis of these gene families with angiosperms.
We undertook a meta-analysis of the GenBank/EMBL/DDBJ gymnosperm EST database and the Picea abies and P. glauca genomes and identified 93 putative CLE genes and 11 CLEL genes among eight Pinophyta species, in the genera Cryptomeria, Pinus and Picea. The predicted conifer CLE and CLEL protein sequences had close phylogenetic relationships with their homologues in Arabidopsis. Notably, perfect conservation of the active CLE dodecapeptide in presumed orthologues of the Arabidopsis CLE41/44-TRACHEARY ELEMENT DIFFERENTIATION (TDIF) protein, an inhibitor of tracheary element (xylem) differentiation, was seen in all eight conifer species. We cloned the Pinus radiata CLE41/44-TDIF orthologues. These genes were preferentially expressed in phloem in planta as expected, but unexpectedly, also in differentiating tracheary element (TE) cultures. Surprisingly, transcript abundances of these TE differentiation-inhibitors sharply increased during early TE differentiation, suggesting that some cells differentiate into phloem cells in addition to TEs in these cultures. Applied CLE13 and CLE41/44 peptides inhibited root elongation in Pinus radiata seedlings. We show evidence that two CLEL genes are alternatively spliced via 3′-terminal acceptor exons encoding separate CLEL peptides.
The CLE and CLEL genes are found in conifers and they exhibit at least as much sequence diversity in these species as they do in other plant species. Only one CLE peptide sequence has been 100% conserved between gymnosperms and angiosperms over 300 million years of evolutionary history, the CLE41/44-TDIF peptide and its likely conifer orthologues. The preferential expression of these vascular development-regulating genes in phloem in conifers, as they are in dicot species, suggests close parallels in the regulation of secondary growth and wood formation in gymnosperm and dicot plants. Based on our bioinformatic analysis, we predict a novel mechanism of regulation of the expression of several conifer CLEL peptides, via alternative splicing resulting in the selection of alternative C-terminal exons encoding separate CLEL peptides.
CLE peptide ligands; CLEL peptide ligands; Pinophyta; Conifers; Phylogenetic analysis; Pine tracheary element system
Transcription factors of the basic leucine zipper (bZIP) family control important processes in all eukaryotes. In plants, bZIPs are regulators of many central developmental and physiological processes including photomorphogenesis, leaf and seed formation, energy homeostasis, and abiotic and biotic stress responses. Here we performed a comprehensive phylogenetic analysis of bZIP genes from algae, mosses, ferns, gymnosperms and angiosperms.
We identified 13 groups of bZIP homologues in angiosperms, three more than known before, that represent 34 Possible Groups of Orthologues (PoGOs). The 34 PoGOs may correspond to the complete set of ancestral angiosperm bZIP genes that participated in the diversification of flowering plants. Homologous genes dedicated to seed-related processes and ABA-mediated stress responses originated in the common ancestor of seed plants, and three groups of homologues emerged in the angiosperm lineage, of which one group plays a role in optimizing the use of energy.
Our data suggest that the ancestor of green plants possessed four bZIP genes functionally involved in oxidative stress and unfolded protein responses that are bZIP-mediated processes in all eukaryotes, but also in light-dependent regulations. The four founder genes amplified and diverged significantly, generating traits that benefited the colonization of new environments.
Transcription factors play a fundamental role in plants by orchestrating temporal and spatial gene expression in response to environmental stimuli. Several R2R3-MYB genes of the Arabidopsis subgroup 4 (Sg4) share a C-terminal EAR motif signature recently linked to stress response in angiosperm plants. It is reported here that nearly all Sg4 MYB genes in the conifer trees Picea glauca (white spruce) and Pinus taeda (loblolly pine) form a monophyletic clade (Sg4C) that expanded following the split of gymnosperm and angiosperm lineages. Deeper sequencing in P. glauca identified 10 distinct Sg4C sequences, indicating over-representation of Sg4 sequences compared with angiosperms such as Arabidopsis, Oryza, Vitis, and Populus. The Sg4C MYBs share the EAR motif core. Many of them had stress-responsive transcript profiles after wounding, jasmonic acid (JA) treatment, or exposure to cold in P. glauca and P. taeda, with MYB14 transcripts accumulating most strongly and rapidly. Functional characterization was initiated by expressing the P. taeda MYB14 (PtMYB14) gene in transgenic P. glauca plantlets with a tissue-preferential promoter (cinnamyl alcohol dehydrogenase) and a ubiquitous gene promoter (ubiquitin). Histological, metabolite, and transcript (microarray and targeted quantitative real-time PCR) analyses of PtMYB14 transgenics, coupled with mechanical wounding and JA application experiments on wild-type plantlets, allowed identification of PtMYB14 as a putative regulator of an isoprenoid-oriented response that leads to the accumulation of sesquiterpenes in conifers. Data further suggested that PtMYB14 may contribute to a broad defence response implicating flavonoids. This study also addresses the potential involvement of closely related Sg4C sequences in stress responses and plant evolution.
Gene family expansion; gymnosperms; isoprenoid metabolism; MYB transcription factors; microarray RNA profiling; Picea glauca; plant evolution; stress response; terpenes; tissue-specific expression
Diminishing global fresh water availability has focused research on elucidating mechanisms of water use in poplar, an economically important species. A GT-2 family trihelix transcription factor that is a determinant of water use efficiency (WUE), PtaGTL1 (GT-2 like 1), was identified in Populus tremula × P. alba (clone 717-IB4). Like other GT-2 family members, PtaGTL1 contains both N- and C-terminal trihelix DNA binding domains. PtaGTL1 expression, driven by the Arabidopsis thaliana AtGTL1 promoter, suppressed the higher WUE and drought tolerance phenotypes of an Arabidopsis GTL1 loss-of-function mutation (gtl1-4). Genetic suppression of gtl1-4 was associated with increased stomatal density due to repression of Arabidopsis STOMATAL DENSITY AND DISTRIBUTION1 (AtSDD1), a negative regulator of stomatal development. Electrophoretic mobility shift assays (EMSA) indicated that a PtaGTL1 C-terminal DNA trihelix binding fragment (PtaGTL1-C) interacted with an AtSDD1 promoter fragment containing the GT3 box (GGTAAA), and this GT3 box was necessary for binding. PtaGTL1-C also interacted with a PtaSDD1 promoter fragment via the GT2 box (GGTAAT). PtaSDD1 encodes a protein with 60% primary sequence identity with AtSDD1. In vitro molecular interaction assays were used to determine that Ca²⁺-loaded calmodulin (CaM) binds to PtaGTL1-C, which was predicted to have a CaM-interaction domain in the first helix of the C-terminal trihelix DNA binding domain. These results indicate that, in Arabidopsis and poplar, GTL1 and SDD1 are fundamental components of stomatal lineage. In addition, PtaGTL1 is a Ca²⁺-CaM binding protein, which suggests a mechanism by which environmental stimuli can induce Ca²⁺ signatures that would modulate stomatal development and regulate plant water use.
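Binding itself was established experimentally by EMSA, but locating candidate GT3 (GGTAAA) or GT2 (GGTAAT) boxes in a promoter is a simple string search on both strands. The following is a minimal sketch of such a scan; the promoter sequence is invented for illustration and is not the AtSDD1 or PtaSDD1 promoter, and the helper names are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))

def find_box(promoter, box):
    """Return sorted (position, strand) hits for box on either strand.

    Positions are 0-based and refer to the forward strand.
    """
    promoter = promoter.upper()
    hits = []
    for strand, motif in (("+", box), ("-", reverse_complement(box))):
        start = promoter.find(motif)
        while start != -1:
            hits.append((start, strand))
            start = promoter.find(motif, start + 1)
    return sorted(hits)

# Made-up promoter fragment containing one GT3 box on each strand.
promoter = "ACGTGGTAAATCCGATTTACCAGT"
gt3_hits = find_box(promoter, "GGTAAA")
gt2_hits = find_box(promoter, "GGTAAT")
print("GT3 box hits:", gt3_hits)
print("GT2 box hits:", gt2_hits)
```

A sequence match only nominates a site; as the abstract notes, necessity for binding had to be shown by assay.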
This research aimed to investigate the role of diverse transcription factors (TFs) and to delineate gene regulatory networks directly in conifers at a relatively high-throughput level. The approach integrated sequence analyses, transcript profiling, and development of a conifer-specific activation assay. Transcript accumulation profiles of 102 TFs and potential target genes were clustered to identify groups of coordinately expressed genes. Several different patterns of transcript accumulation were observed by profiling in nine different organs and tissues: 27 genes were preferential to secondary xylem both in stems and roots, and other genes were preferential to phelloderm and periderm or were more ubiquitous. A robust system has been established as a screening approach to define which TFs have the ability to regulate a given promoter in planta. Trans-activation or repression effects were observed in 30% of TF–candidate gene promoter combinations. As a proof of concept, phylogenetic analysis and expression and trans-activation data were used to demonstrate that two spruce NAC-domain proteins most likely play key roles in secondary vascular growth as observed in other plant species. This study tested many TFs from diverse families in a conifer tree species, which broadens the knowledge of promoter–TF interactions in wood development and enables comparisons of gene regulatory networks found in angiosperms and gymnosperms.
Conifer; expression pattern; Picea glauca; secondary cell wall; somatic embryogenesis; trans-activation assay; transcription factor; xylem.
The developmental mechanisms regulating cell differentiation and patterning during the secondary growth of woody tissues are poorly understood. Class III HD ZIP transcription factors are evolutionarily ancient and play fundamental roles in various aspects of plant development. Here we investigate the role of a Class III HD ZIP transcription factor, POPCORONA, during secondary growth of woody stems. Transgenic Populus (poplar) trees expressing either a miRNA-resistant POPCORONA or a synthetic miRNA targeting POPCORONA were used to infer function of POPCORONA during secondary growth. Whole plant, histological, and gene expression changes were compared for transgenic and wild-type control plants. Synthetic miRNA knock down of POPCORONA results in abnormal lignification in cells of the pith, while overexpression of a miRNA-resistant POPCORONA results in delayed lignification of xylem and phloem fibers during secondary growth. POPCORONA misexpression also results in coordinated changes in expression of genes within a previously described transcriptional network regulating cell differentiation and cell wall biosynthesis, and hormone-related genes associated with fiber differentiation. POPCORONA illustrates another function of Class III HD ZIPs: regulating cell differentiation during secondary growth.
The CLE (CLAVATA3/Embryo Surrounding Region-related) gene family encodes small signaling peptides that are primarily involved in coordinating stem cell fate in different types of plant meristems. Their roles in vascular cambium have highlighted their potential function in wood formation. Despite recent advances in the identification and characterization of CLE genes, little is known about this gene family in tree species.
Fifty PtCLE genes were identified from the Populus trichocarpa genome and were classified into four major groups based on sequence similarity. Analysis of the genomic organization of PtCLE genes indicates that genome duplication, as well as diversity in the CLE motif, has contributed to the expansion of the CLE gene family in poplar. A comparison with functionally characterized Arabidopsis CLE protein sequences showed that many PtCLE proteins are closely related to their predicted Arabidopsis counterparts. In particular, PtCLE3, PtCLE12, PtCLE14 and PtCLE38 contain a CLE motif identical to that of AtCLE41/TDIF, which is known as a regulator of vascular cambium homeostasis, strongly supporting the idea that similar signaling pathways exist in both species to regulate wood formation and secondary growth. Transcriptome profiling revealed that PtCLE genes generally were differentially expressed, while some PtCLE genes exhibited tissue-specific expression patterns. Moreover, compared to their Arabidopsis counterparts, PtCLE genes showed either similar or distinct expression patterns, implying functional conservation in some cases and functional divergence in others.
Our study provides a genome-wide analysis of the CLE gene family in poplar, and highlights the potential roles of key PtCLE genes in the regulation of secondary growth and wood formation. The comparative analysis revealed that functional conservation may exist between PtCLEs and their AtCLE orthologues, which was further supported by transcriptomic analysis. Transcriptional profiling provided further insights into possible functional divergence, evidenced by differential expression patterns of various PtCLE genes.
CLE peptide; Populus trichocarpa; Arabidopsis thaliana; Phylogenetic analysis; Transcriptional profiling
Plant LIM domain proteins may act as transcriptional activators of lignin biosynthesis and/or as actin binding and bundling proteins. Plant LIM genes have evolved in phylogenetic subgroups differing in their expression profiles: in the whole plant or specifically in pollen. However, several poplar PtLIM genes belong to uncharacterized monophyletic subgroups and the expression patterns of the LIM gene family in a woody plant have not been studied.
In this work, the expression patterns of the twelve duplicated poplar PtLIM genes were investigated by semi-quantitative RT-PCR in different vegetative and reproductive tissues. As in other plant species, poplar PtLIM genes were expressed either widely throughout the tree or in particular tissues. In particular, the PtXLIM1a, PtXLIM1b and PtWLIM1b genes were preferentially expressed in the secondary xylem, suggesting a specific function in wood formation. Moreover, the expression of these genes and of the PtPLIM2a gene was increased in tension wood. Western-blot analysis confirmed the preferential expression of PtXLIM1a protein during xylem differentiation and tension wood formation. Genes classified within the pollen-specific PLIM2 and PLIM2-like subgroups were all strongly expressed in pollen but also in cottony hairs. Interestingly, pairs of duplicated PtLIM genes exhibited different expression patterns, indicating subfunctionalisation in specific tissues.
The strong expression of several LIM genes in cottony hairs and germinating pollen, as well as in xylem fibers suggests an involvement of plant LIM domain proteins in the control of cell expansion. Comparisons of expression profiles of poplar LIM genes with the published functions of closely related plant LIM genes suggest conserved functions in the areas of lignin biosynthesis, pollen tube growth and mechanical stress response. Based on these results, we propose a novel nomenclature of poplar LIM domain proteins.
Conifers have very large genomes (13 to 30 Gigabases) that are mostly uncharacterized although extensive cDNA resources have recently become available. This report presents a global overview of transcriptome variation in a conifer tree and documents conservation and diversity of gene expression patterns among major vegetative tissues.
An oligonucleotide microarray was developed from Picea glauca and P. sitchensis cDNA datasets. It represents 23,853 unique genes and was shown to be suitable for transcriptome profiling in several species. A comparison of secondary xylem and phelloderm tissues showed that preferential expression in these vascular tissues was highly conserved among Picea spp. RNA-Sequencing strongly confirmed tissue-preferential expression and provided a robust validation of the microarray design. A small database of transcription profiles called PiceaGenExpress was developed from over 150 hybridizations spanning eight major tissue types. In total, transcripts were detected for 92% of the genes on the microarray in at least one tissue. Non-annotated genes were predominantly expressed at low levels in fewer tissues than genes of known or predicted function. Diversity of expression within gene families may be rapidly assessed from PiceaGenExpress. In conifer trees, dehydrins and late embryogenesis abundant (LEA) osmotic regulation proteins occur in large gene families compared to angiosperms. Strong contrasts and low diversity were observed in the dehydrin family, while diverse patterns suggested a greater degree of diversification among LEAs.
Together, the oligonucleotide microarray and the PiceaGenExpress database represent the first resource of this kind for gymnosperm plants. The spruce transcriptome analysis reported here is expected to accelerate genetic studies in the large and important group comprised of conifer trees.
The genus Populus represents one of the most economically important groups of forest trees. It comprises approximately 30 species used for wood and non-wood products, phytoremediation and biomass. Poplar is subject to several biological and environmental threats, yet we know far less about the genetic bases of its biotic stress resistance than we do for annual crops. Woolly poplar aphid (Phloeomyzus passerinii) is considered a main pest of cultivated poplars in European and American countries. In this work we present two high-density linkage maps in poplar obtained by a genotyping-by-sequencing (GBS) approach and the identification of QTLs involved in Ph. passerinii resistance. A total of 5,667 polymorphic markers (5,606 SNPs and 61 SSRs) identified on expressed sequences were used to genotype 131 plants of an F1 population of P. × canadensis obtained by an interspecific cross between Populus deltoides (resistant to woolly poplar aphid) and Populus nigra (susceptible to woolly poplar aphid). The two linkage maps, obtained following the two-way pseudo-testcross mapping strategy, were used to investigate the genetic bases of woolly poplar aphid resistance. One major QTL and two QTLs with minor effects (mapped on LG V, LG XVI and LG XIX), together explaining 65.8% of the genetic variance observed in the progeny in response to Ph. passerinii attack, were found. The high-density coverage of functional markers allowed the identification of three genes belonging to disease resistance pathways as putative candidates for P. deltoides resistance to woolly poplar aphid. This work is the first report on the genetics of resistance to woolly poplar aphid, and the markers associated with the resistance loci represent a valuable tool for poplar resistance breeding programs.
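As a rough illustration of the two-way pseudo-testcross strategy mentioned above, the sketch below sorts markers by parental genotype configuration. Markers heterozygous in only one parent segregate 1:1 in the F1 and inform that parent's map; markers heterozygous in both parents can serve as bridges between the two maps. The function name, labels and example genotypes are assumptions for illustration and are not taken from the study.

```python
def pseudo_testcross_class(parent1: tuple[str, str], parent2: tuple[str, str]) -> str:
    """Classify a marker from the two alleles carried by each parent."""
    het1 = parent1[0] != parent1[1]  # parent 1 heterozygous at this marker
    het2 = parent2[0] != parent2[1]  # parent 2 heterozygous at this marker
    if het1 and not het2:
        return "parent1_map"    # expected 1:1 segregation, informs parent 1 map
    if het2 and not het1:
        return "parent2_map"    # expected 1:1 segregation, informs parent 2 map
    if het1 and het2:
        return "intercross"     # heterozygous in both parents, bridges the maps
    return "uninformative"      # both parents homozygous, no segregation in F1


# Hypothetical SNP genotypes (assumption), given as (allele, allele) per parent.
markers = {
    "snp_1": (("A", "G"), ("A", "A")),
    "snp_2": (("C", "C"), ("C", "T")),
    "snp_3": (("A", "T"), ("A", "T")),
    "snp_4": (("G", "G"), ("G", "G")),
}
classes = {name: pseudo_testcross_class(p1, p2) for name, (p1, p2) in markers.items()}

# Marker counts reported in the text: SNPs plus SSRs should equal the total.
total_markers = 5606 + 61
```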
Plant Q-type C2H2 zinc finger transcription factors play an important role in plant tolerance to various environmental stresses such as drought, cold, osmotic stress, wounding and mechanical loading. To better analyse the specific role of each member of this subfamily in the response to mechanical loading in poplar, we identified 16 predicted two-fingered Q-type C2H2 proteins from the poplar Phytozome database and compared their phylogenetic relationships with 152 two-fingered Q-type C2H2 protein sequences from more than 50 species, retrieved from the NCBI NR protein database. Phylogenetic analyses of these Q-type C2H2 protein sequences classified them into two groups, G1 and G2, and the distributions of conserved motifs of interest were established. These two groups differed essentially in their signatures at the C-terminus of their two QALGGH DNA-binding domains. Two additional conserved motifs, MALEAL and LVDCHY, were found only in sequences from Group G1 or from Group G2, respectively. The functional significance of these phylogenetic divergences was assessed by studying transcript accumulation of six poplar Q-type C2H2 genes in response to abiotic stresses, but no group specificity was found in any organ. Further expression analyses focused on PtaZFP1 and PtaZFP2, the two genes strongly induced by mechanical loading in poplars. The results revealed that these two genes were regulated by several signalling molecules, including hydrogen peroxide and the phytohormone jasmonate.
C2H2; phylogenetic analysis; abiotic stress; mechanical loading
Small, secreted signaling peptides work in parallel with phytohormones to control important aspects of plant growth and development. Genes from the C-TERMINALLY ENCODED PEPTIDE (CEP) family produce such peptides which negatively regulate plant growth, especially under stress, and affect other important developmental processes. To illuminate how the CEP gene family has evolved within the plant kingdom, including its emergence, diversification and variation between lineages, a comprehensive survey was undertaken to identify and characterize CEP genes in 106 plant genomes.
Using a motif-based system developed for this study to identify canonical CEP peptide domains, a total of 916 CEP genes and 1,223 CEP domains were found in angiosperms and for the first time in gymnosperms. This defines a narrow band for the emergence of CEP genes in plants, from the divergence of lycophytes to the angiosperm/gymnosperm split. Both CEP genes and domains were found to have diversified in angiosperms, particularly in the Poaceae and Solanaceae plant families. Multispecies orthologous relationships were determined for 22% of identified CEP genes, and further analysis of those groups found selective constraints upon residues within the CEP peptide and within the previously little-characterized variable region. An examination of public Oryza sativa RNA-Seq datasets revealed an expression pattern that links OsCEP5 and OsCEP6 to panicle development and flowering, and CEP gene trees reveal these emerged from a duplication event associated with the Poaceae plant family.
The characterization of the plant-family-specific CEP genes OsCEP5 and OsCEP6, the association of CEP genes with angiosperm-specific developmental processes such as panicle development, and the diversification of CEP genes in angiosperms provide further support for the hypothesis that CEP genes have been integral to the evolution of novel traits within the angiosperm lineage. Beyond these findings, the comprehensive set of CEP genes and their properties reported here will be a resource for future research on CEP genes and peptides.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-870) contains supplementary material, which is available to authorized users.
C-terminally encoded peptide; Gene family; Signaling peptides; GC-biased gene conversion; Panicle development; Orthology detection; Angiosperm evolution
Tectona grandis is currently one of the most valuable trees in the world, yet no transcript dataset related to its secondary xylem is available. Despite the importance of the secondary xylem and sapwood transition from young to mature trees, little is known about the expression differences between those successional processes or about which transcription factors could regulate lignin biosynthesis in this tropical tree. Although MYB transcription factors form one of the largest superfamilies in plants related to secondary metabolism, they have not yet been characterized in teak. These results will open new perspectives for studies of diversity, ecology, breeding and genomic programs aiming to deepen our understanding of the biology of this species.
We present a widely expressed gene catalog for T. grandis using Illumina technology and de novo assembly. A total of 462,260 transcripts were obtained, with 1,502 and 931 genes differentially expressed for stem and branch secondary xylem, respectively, during age transition. Analysis of stem and branch secondary xylem indicates substantial similarity in gene ontologies, including carbohydrate enzymes, response to stress and protein binding, and allowed us to find differentially expressed transcription factors and heat-shock proteins. TgMYB1 displays a MYB domain and a predicted coiled-coil (CC) domain, while TgMYB2, TgMYB3 and TgMYB4 contain an R2R3-MYB domain and group with MYBs from several gymnosperms and flowering plants. TgMYB1, TgMYB4 and TgCES showed higher expression in mature secondary xylem, in contrast with TgMYB2, TgHsp1, TgHsp2, TgHsp3 and TgBi, whose expression is higher in young lignified tissues. TgMYB3 is expressed at a lower level in secondary xylem.
Expression patterns of MYB transcription factors and heat-shock proteins in lignified tissues differed with tree development: TgMYB1 and TgMYB4 were more highly expressed in lignified tissues of 60-year-old trees, whereas TgHsp1, TgHsp2, TgHsp3 and TgBi were more highly expressed in stem secondary xylem of 12-year-old trees. These genes open the way to further functional characterization by reverse genetics and to marker-assisted selection. Investigating some of the key regulators of lignin biosynthesis in teak could, moreover, be a valuable step towards understanding how the rigidity and extractives content of teak wood differ from those of most other woods. The transcriptome data obtained represent new T. grandis sequences deposited in public databases and offer an unprecedented opportunity to discover genes associated with secondary xylem, such as transcription factors and stress-related genes, in a tropical tree.
Electronic supplementary material
The online version of this article (doi:10.1186/s12870-015-0599-x) contains supplementary material, which is available to authorized users.
A novel result of the current research is the development and implementation of a unique functional phylogenomic approach that explores the genomic origins of seed plant diversification. We first use 22,833 sets of orthologs from the nuclear genomes of 101 genera across land plants to reconstruct their phylogenetic relationships. One of the more salient results is the resolution of some enigmatic relationships in seed plant phylogeny, such as the placement of Gnetales as sister to the rest of the gymnosperms. In using this novel phylogenomic approach, we were also able to identify overrepresented functional gene ontology categories in genes that provide positive branch support for major nodes prompting new hypotheses for genes associated with the diversification of angiosperms. For example, RNA interference (RNAi) has played a significant role in the divergence of monocots from other angiosperms, which has experimental support in Arabidopsis and rice. This analysis also implied that the second largest subunit of RNA polymerase IV and V (NRPD2) played a prominent role in the divergence of gymnosperms. This hypothesis is supported by the lack of 24nt siRNA in conifers, the maternal control of small RNA in the seeds of flowering plants, and the emergence of double fertilization in angiosperms. Our approach takes advantage of genomic data to define orthologs, reconstruct relationships, and narrow down candidate genes involved in plant evolution within a phylogenomic view of species' diversification.
Understanding the genetic and genomic basis of plant diversification has been a major goal of evolutionary biologists since Darwin first pondered his “abominable mystery,” the rapid diversification of the angiosperms in the fossil record. We develop and deploy a functional phylogenomic approach that helps identify genes and biological processes putatively involved in species diversification. We assembled a matrix of 22,833 orthologs from 150 species to reconstruct seed plant phylogenetic relationships and to identify gene sets with a unique evolutionary signal. Our analysis of overrepresented biological processes in these sets narrowed down possible genetic mechanisms underlying plant adaptation and diversification. The phylogenetic relationships we uncovered support the hypothesis that gnetophytes are closely related to the rest of the gymnosperms at the base of the living seed plants. We also found that genes involved in post-transcriptional silencing via RNA interference (RNAi)—increasingly important in understanding plant evolution—are significantly represented early in angiosperm and gymnosperm divergence, with an apparent loss of specific classes of small interfering RNAs (siRNA) in gymnosperms. Our functional phylogenomic approach can be applied to any taxa with available sequences to enhance our knowledge of the evolutionary processes underlying biodiversity in general.
Laser capture microdissection (LCM) enables precise dissection and collection of individual cell types from complex tissues. When applied to plant cells, and especially to woody tissues, LCM requires extensive optimization to overcome such factors as rigid cell walls, large central vacuoles, intercellular spaces, and technical issues with the thickness and flatness of the sections. Here we present an optimized protocol for the laser-assisted microdissection of developing xylem from mature trees: a gymnosperm (Norway spruce, Picea abies) and an angiosperm (aspen, Populus tremula). Different cell types of spruce and aspen wood (i.e., ray cells, tracheary elements, and fibers) were successfully microdissected from tangential, cross and radial cryosections of the current year’s growth ring. Two approaches were applied to achieve satisfactory flatness and anatomical integrity of the spruce and aspen specimens. The commonly used membrane slides were ineffective as a mounting surface for the wood cryosections. Instead, in the present protocol we use glass slides, and introduce a glass slide sandwich assembly for the preparation of aspen sections. To confirm that the LCM procedure compromised neither the anatomical integrity of the plant tissue nor its molecular features, we verified that good-quality total RNA could be extracted from the microdissected cells. This showed the efficiency of the protocol and established that our methodology can be integrated into transcriptome analyses to elucidate cell-specific molecular events regulating wood formation in trees.
cryosection; laser capture microdissection; ray cells; RNA integrity; tracheids; xylem fibers
Background and Aims
The closely related NAC family genes NO APICAL MERISTEM (NAM) and CUP-SHAPED COTYLEDON3 (CUC3) regulate the formation of boundaries within and between plant organs. NAM is post-transcriptionally regulated by miR164, whereas CUC3 is not. To gain insight into the evolution of NAM and CUC3 in the angiosperms, we analysed orthologous genes in early-diverging ANA-grade angiosperms and gymnosperms.
We obtained NAM- and CUC3-like sequences from diverse angiosperms and gymnosperms by a combination of reverse transcriptase PCR, cDNA library screening and database searching, and then investigated their phylogenetic relationships by performing maximum-likelihood reconstructions. We also studied the spatial expression patterns of NAM, CUC3 and MIR164 orthologues in female reproductive tissues of Amborella trichopoda, the probable sister to all other flowering plants.
Separate NAM and CUC3 orthologues were found in early-diverging angiosperms, but not in gymnosperms, which contained putative orthologues of the entire NAM + CUC3 clade that possessed sites of regulation by miR164. Multiple paralogues of NAM or CUC3 genes were noted in certain taxa, including Brassicaceae. Expression of NAM, CUC3 and MIR164 orthologues from Am. trichopoda was found to co-localize in ovules at the developmental boundary between the chalaza and nucellus.
The NAM and CUC3 lineages were generated by duplication, and CUC3 subsequently lost regulation by miR164, prior to the last common ancestor of the extant angiosperms. However, the paralogous NAM clade genes CUC1 and CUC2 were generated by a more recent duplication, near the base of Brassicaceae. The function of NAM and CUC3 in defining a developmental boundary in the ovule appears to have been conserved since the last common ancestor of the flowering plants, as does the post-transcriptional regulation of NAM by miR164 in ovule tissues.
CUP-SHAPED COTYLEDON; CUC; NO APICAL MERISTEM; NAM; NAC; MIR164; Amborella trichopoda; Cabomba aquatica; Ginkgo biloba; angiosperm; gymnosperm
Wood is a major renewable natural resource for the timber, fibre and bioenergy industry. Pinus radiata D. Don is the most important commercial plantation tree species in Australia and several other countries; however, genomic resources for this species are very limited in public databases. Our primary objective was to sequence a large number of expressed sequence tags (ESTs) from genes involved in wood formation in radiata pine.
Six developing xylem cDNA libraries were constructed from earlywood and latewood tissues sampled at juvenile (7 yrs), transition (11 yrs) and mature (30 yrs) ages. These xylem tissues represent six typical developmental stages in a rotation period of radiata pine. A total of 6,389 high-quality ESTs were collected from 5,952 cDNA clones. Assembly of 5,952 ESTs from 5' end sequences generated 3,304 unigenes, including 952 contigs and 2,352 singletons. About 97.0% of the 5,952 ESTs and 96.1% of the unigenes have matches in the UniProt and TIGR databases. Of the 3,174 unigenes with matches, 42.9% were not assigned GO (Gene Ontology) terms and their functions are unknown or unclassified. More than half (52.1%) of the 5,952 ESTs have matches in the Pfam database and represent 772 known protein families. About 18.0% of the 5,952 ESTs matched cell wall-related genes in the MAIZEWALL database, representing all 18 categories, 91 of all 174 families and possibly 557 genes. Fifteen cell wall-related genes ranked among the 30 most abundant genes, including CesA, tubulin, AGP, SAMS, actin, laccase, CCoAMT, MetE, phytocyanin, pectate lyase, cellulase, SuSy, expansin, chitinase and UDP-glucose dehydrogenase. Based on the PlantTFDB database, 41 of the 64 transcription factor families in the poplar genome were identified as being involved in radiata pine wood formation. Comparative analysis of GO term abundance revealed a distinct transcriptome in juvenile earlywood formation compared to other stages of wood development.
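The assembly figures above are mutually consistent, which the following short check illustrates. The input numbers are taken from the text; the variable names and the derived "ESTs per contig" figure are additions for illustration and do not appear in the original.

```python
# Figures reported in the text.
assembled_ests = 5952
contigs = 952
singletons = 2352
unigenes_with_matches = 3174

# Unigenes are the contigs plus the singletons.
unigenes = contigs + singletons

# Share of unigenes with a database match, as a percentage to one decimal place.
pct_unigenes_matched = round(100 * unigenes_with_matches / unigenes, 1)

# Derived figure (not stated in the text): each singleton is one EST, so the
# remaining ESTs were merged into contigs.
ests_in_contigs = assembled_ests - singletons
mean_ests_per_contig = ests_in_contigs / contigs
```

Under these figures, the contigs absorb 3,600 ESTs, which works out to an average of a little under four ESTs per contig.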
The first large scale genomic resource in radiata pine was generated from six developing xylem cDNA libraries. Cell wall-related genes and transcription factors were identified. Juvenile earlywood has a distinct transcriptome, which is likely to contribute to the undesirable properties of juvenile wood in radiata pine. The publicly available resource of radiata pine will also be valuable for gene function studies and comparative genomics in forest trees.