Eukaryotic protein kinases belong to a large superfamily with hundreds to thousands of copies and are components of essentially all cellular functions. The goals of this study are to classify protein kinases from 25 plant species and to assess their evolutionary history in conjunction with consideration of their molecular functions. The protein kinase superfamily has expanded in the flowering plant lineage, in part through recent duplications. As a result, the flowering plant protein kinase repertoire, or kinome, is in general significantly larger than those of other eukaryotes, ranging in size from 600 to 2500 members. This large variation in kinome size is mainly due to the expansion and contraction of a few families, particularly the receptor-like kinase/Pelle family. A number of protein kinases reside in highly conserved, low copy number families and often play broadly conserved regulatory roles in metabolism and cell division, although functions of plant homologues have often diverged from their metazoan counterparts. Members of expanded plant kinase families often have roles in plant-specific processes and some may have contributed to adaptive evolution. Nonetheless, non-adaptive explanations, such as kinase duplicate subfunctionalization and insufficient time for pseudogenization, may also contribute to the large number of seemingly functional protein kinases in plants.
plant protein kinase; gene family evolution; lineage-specific expansion; comparative genomics
Pseudoperonospora cubensis, an obligate oomycete pathogen, is the causal agent of cucurbit downy mildew, a foliar disease of global economic importance. Similar to other oomycete plant pathogens, Ps. cubensis has a suite of RXLR and RXLR-like effector proteins, which likely function as virulence or avirulence determinants during the course of host infection. Using in silico analyses, we identified 271 candidate effector proteins within the Ps. cubensis genome with variable RXLR motifs. In extending this analysis, we present the functional characterization of one Ps. cubensis effector protein, RXLR protein 1 (PscRXLR1), and its closest Phytophthora infestans ortholog, PITG_17484, a member of the Drug/Metabolite Transporter (DMT) superfamily. To assess if such effector-non-effector pairs are common among oomycete plant pathogens, we examined the relationship(s) among putative ortholog pairs in Ps. cubensis and P. infestans. Of 271 predicted Ps. cubensis effector proteins, only 109 (41%) had a putative ortholog in P. infestans and evolutionary rate analysis of these orthologs shows that they are evolving significantly faster than most other genes. We found that PscRXLR1 was up-regulated during the early stages of infection of plants, and, moreover, that heterologous expression of PscRXLR1 in Nicotiana benthamiana elicits a rapid necrosis. More interestingly, we also demonstrate that PscRXLR1 arises as a product of alternative splicing, making this the first example of an alternative splicing event in plant pathogenic oomycetes transforming a non-effector gene to a functional effector protein. Taken together, these data suggest a role for PscRXLR1 in pathogenicity, and, in total, our data provide a basis for comparative analysis of candidate effector proteins and their non-effector orthologs as a means of understanding function and evolutionary history of pathogen effectors.
Understanding the molecular mechanisms of pathogen emergence is central to mitigating the impacts of novel infectious disease agents. The chytrid fungus Batrachochytrium dendrobatidis (Bd) is an emerging pathogen of amphibians that has been implicated in amphibian declines worldwide. Bd is the only member of its clade known to attack vertebrates. However, little is known about the molecular determinants of - or evolutionary transition to - pathogenicity in Bd. Here we sequence the genome of Bd's closest known relative - a non-pathogenic chytrid Homolaphlyctis polyrhiza (Hp). We first describe the genome of Hp, which is comparable to other chytrid genomes in size and number of predicted proteins. We then compare the genomes of Hp, Bd, and 19 additional fungal genomes to identify unique or recent evolutionary elements in the Bd genome. We identified 1,974 Bd-specific genes, a gene set that is enriched for protease, lipase, and microbial effector Gene Ontology terms. We describe significant lineage-specific expansions in three Bd protease families (metallo-, serine-type, and aspartyl proteases). We show that these protease gene family expansions occurred after the divergence of Bd and Hp from their common ancestor and thus are localized to the Bd branch. Finally, we demonstrate that the timing of the protease gene family expansions predates the emergence of Bd as a globally important amphibian pathogen.
The chytrid fungus Batrachochytrium dendrobatidis (Bd) is an emerging pathogen that has been implicated in decimating amphibian populations around the world. Bd is the only member of an ancient group of fungi (called the Chytridiomycota) that is known to attack vertebrates. The question of how an amphibian-killing fungus evolved from non-pathogenic ancestors is vital to protecting the world's remaining amphibians from Bd. We sequenced the genome of Bd's closest known relative - a non-pathogenic chytrid named Homolaphlyctis polyrhiza (Hp). We compared the genomes of Bd, Hp and 18 additional fungi to identify what makes Bd unique. We identified a large number of Bd-specific genes, a gene set that contains a number of possible pathogenicity factors. In particular, we describe a large number of protease genes in the Bd genome and show that these genes were duplicated after the divergence of Bd and Hp from their common ancestor. Studying Bd's pathogenesis in an evolutionary context provides new evidence for the role of protease genes in Bd's ability to kill amphibians.
Solanum commersonii and Solanum tuberosum are closely related plant species that differ in their abilities to cold acclimate; whereas S. commersonii increases in freezing tolerance in response to low temperature, S. tuberosum does not. In Arabidopsis thaliana, cold-regulated genes have been shown to contribute to freezing tolerance, including those that comprise the CBF regulon, genes that are controlled by the CBF transcription factors. The low temperature transcriptomes and CBF regulons of S. commersonii and S. tuberosum were therefore compared to determine whether there might be differences that contribute to their differences in ability to cold acclimate. The results indicated that both plants alter gene expression in response to low temperature to similar degrees with similar kinetics and that both plants have CBF regulons composed of hundreds of genes. However, there were considerable differences in the sets of genes that comprised the low temperature transcriptomes and CBF regulons of the two species. Thus differences in cold regulatory programmes may contribute to the differences in freezing tolerance of these two species. However, 53 groups of putative orthologous genes that are cold-regulated in S. commersonii, S. tuberosum, and A. thaliana were identified. Given that the evolutionary distance between the two Solanum species and A. thaliana is 112–156 million years, it seems likely that these conserved cold-regulated genes—many of which encode transcription factors and proteins of unknown function—have fundamental roles in plant growth and development at low temperature.
Arabidopsis; CBF regulon; freezing tolerance; low temperature transcriptome; Solanum species
Sulfate is an essential nutrient cycled in nature. Ion transporters that specifically facilitate the transport of sulfate across membranes are found ubiquitously in living organisms. The phylogenetic analysis of known sulfate transporters and their homologous proteins from eukaryotic organisms indicates two evolutionarily distinct groups of sulfate transport systems. One major group named Tribe 1 represents yeast and fungal SUL, plant SULTR, and animal SLC26 families. The evolutionary origin of SULTR family members in land plants and green algae is suggested to be common with yeast and fungal SUL and animal anion exchangers (SLC26). The lineage of the plant SULTR family is expanded into four subfamilies (SULTR1–SULTR4) in land plant species. By contrast, the putative SULTR homologs from Chlorophyte green algae are in two separate lineages: one with the subfamily of plant tonoplast-localized sulfate transporters (SULTR4), and the other diverged before the appearance of lineages for SUL, SULTR, and SLC26. There was also a group of as yet undefined putative sulfate transporters in yeast and fungi that diverged from these major lineages in Tribe 1. The other distinct group is Tribe 2, primarily composed of animal sodium-dependent sulfate/carboxylate transporters (SLC13) and plant tonoplast-localized dicarboxylate transporters (TDT). The putative sulfur-sensing protein (SAC1) and SAC1-like transporters (SLT) of Chlorophyte green algae, bryophytes, and lycophytes show low degrees of sequence similarity with SLC13 and TDT. However, the phylogenetic relationship between SAC1/SLT and the other two families, SLC13 and TDT in Tribe 2, is not clearly supported. In addition, the SAC1/SLT family is absent in the angiosperm species analyzed. The present study suggests distinct evolutionary trajectories of sulfate transport systems for land plants and green algae.
evolution; plant; sulfate; transporter
In plants and animals innate immunity is the first line of defence against attack by microbial pathogens. Specific molecular features of bacteria and fungi are recognised by pattern recognition receptors that have extracellular domains containing leucine rich repeats. Recognition of microbes by these receptors induces defence responses that protect hosts against potential microbial attack.
A survey of genome sequences from 101 species, representing a broad cross-section of the eukaryotic phylogenetic tree, reveals an absence of leucine rich repeat-domain containing receptors in the fungal kingdom. Uniquely, however, fungi possess adenylate cyclases that contain distinct leucine rich repeat-domains, which have been demonstrated to act as an alternative means of perceiving the presence of bacteria by at least one fungal species. Interestingly, the morphologically similar osmotrophic oomycetes, which are taxonomically distant members of the stramenopiles, possess pattern recognition receptors with similar domain structures to those found in plants.
The absence of pattern recognition receptors suggests that fungi may possess novel classes of pattern-recognition receptor, such as the modified adenylate cyclase, or instead rely on secretion of anti-microbial secondary metabolites for protection from microbial attack. The absence of pattern recognition receptors in fungi, coupled with their abundance in oomycetes, suggests this may be a unique characteristic of the fungal kingdom rather than a consequence of the osmotrophic growth form.
RBR ubiquitin ligases are components of the ubiquitin-proteasome system present in all eukaryotes. They are characterized by having the RBR (RING – IBR – RING) supradomain. In this study, the patterns of emergence of RBR genes in plants are described.
Phylogenetic and structural data confirm that just four RBR subfamilies (Ariadne, ARA54, Plant I/Helicase and Plant II) exist in viridiplantae. All of them originated before the split that separated green algae from the rest of plants. Multiple genes of two of these subfamilies (Ariadne and Plant II) appeared in early plant evolution. It is deduced that the common ancestor of all plants contained at least five RBR genes and the available data suggest that this number has been increasing slowly along streptophyta evolution, although losses, especially of Helicase RBR genes, have also occurred in several lineages. Some higher plants (e. g. Arabidopsis thaliana, Oryza sativa) contain a very large number of RBR genes and many of them were recently generated by tandem duplications. Microarray data indicate that most of these new genes have low-level and sometimes specific expression patterns. On the contrary, and as occurs in animals, a small set of older genes are broadly expressed at higher levels.
The available data suggest that the dynamics of appearance and conservation of RBR genes are quite different in plants from what has been described in animals. In animals, many structurally diverse RBR subfamilies emerged abruptly in early animal history, followed by losses of multiple genes in particular lineages. These patterns are not observed in plants. It is also shown that while both plants and animals contain a small, similar set of essential RBR genes, the rest evolve differently. The functional implications of these results are discussed.
The availability of genome and transcriptome sequences for a number of species permits the identification and characterization of conserved as well as divergent genes such as lineage-specific genes which have no detectable sequence similarity to genes from other lineages. While genes conserved among taxa provide insight into the core processes among species, lineage-specific genes provide insights into evolutionary processes and biological functions that are likely clade or species specific.
Comparative analyses using the Arabidopsis thaliana genome and sequences from 178 other species within the Plant Kingdom enabled the identification of 24,624 A. thaliana genes (91.7%) that were termed Evolutionary Conserved (EC) as defined by sequence similarity to a database entry as well as two sets of lineage-specific genes within A. thaliana. One of the A. thaliana lineage-specific gene sets shares sequence similarity only with sequences from species within the Brassicaceae family, and its members are termed Conserved Brassicaceae-Specific Genes (914, 3.4%, CBSG). The other set of A. thaliana lineage-specific genes, the Arabidopsis Lineage-Specific Genes (1,324, 4.9%, ALSG), lacks sequence similarity to any sequence outside A. thaliana. While many CBSGs (76.7%) and ALSGs (52.9%) are transcribed, the majority of the CBSGs (76.1%) and ALSGs (94.4%) have no annotated function. Co-expression analysis indicated significant enrichment of the CBSGs and ALSGs in multiple functional categories, suggesting their involvement in a wide range of biological functions. Subcellular localization prediction revealed that the CBSGs were significantly enriched in proteins targeted to the secretory pathway (412, 45.1%). Among the 107 putatively secreted CBSGs with known functions, 67 encode a putative pollen coat protein or cysteine-rich protein with sequence similarity to the S-locus cysteine-rich protein that is the pollen determinant controlling allele-specific pollen rejection in self-incompatible Brassicaceae species. Overall, the ALSGs and CBSGs were more highly methylated in floral tissue compared to the ECs. Single Nucleotide Polymorphism (SNP) analysis showed an elevated ratio of non-synonymous to synonymous SNPs within the ALSGs (1.99) and CBSGs (1.65) relative to the EC set (0.92), mainly caused by an elevated number of non-synonymous SNPs, indicating that they are fast-evolving at the protein sequence level.
Our analyses suggest that while a significant fraction of the A. thaliana proteome is conserved within the Plant Kingdom, evolutionarily distinct sets of genes that may function in defining biological processes unique to these lineages have arisen within the Brassicaceae and A. thaliana.
Due to the selection pressure imposed by highly variable environmental conditions, stress sensing and regulatory response mechanisms in plants are expected to evolve rapidly. One potential source of innovation in plant stress response mechanisms is gene duplication. In this study, we examined the evolution of stress-regulated gene expression among duplicated genes in the model plant Arabidopsis thaliana. Key to this analysis was reconstructing the putative ancestral stress regulation pattern. By comparing the expression patterns of duplicated genes with the patterns of their ancestors, we found that duplicated genes likely lost and gained stress responses at a rapid rate initially, but that the rate is close to zero when the synonymous substitution rate (a proxy for time) is >∼0.8. When considering duplicated gene pairs, we found that partitioning of putative ancestral stress responses occurred more frequently than parallel retention or loss. Furthermore, the pattern of stress response partitioning was extremely asymmetric. An analysis of putative cis-acting DNA regulatory elements in the promoters of the duplicated stress-regulated genes indicated that the asymmetric partitioning of ancestral stress responses is likely due, at least in part, to differential loss of DNA regulatory elements; the duplicated genes losing most of their stress responses were those that had lost more of the putative cis-acting elements. Finally, duplicate genes that lost most or all of the ancestral responses are more likely to have gained responses to other stresses. Therefore, the retention of duplicates that inherit few or no functions seems to be coupled to neofunctionalization. Taken together, our findings provide new insight into the patterns of evolutionary changes in gene stress responses after duplication and lay the foundation for testing the adaptive significance of stress regulatory changes under highly variable biotic and abiotic environments.
Plants have developed a multitude of response mechanisms to survive stressful environments. Since the environment is highly variable, these stress response mechanisms are expected to undergo frequent innovation. Duplicate genes represent a potential source for such innovation. In this paper, we explored the evolutionary changes in stress responses at the transcriptional level among duplicated genes in the model plant Arabidopsis thaliana. We found that after gene duplication, ancestral stress responses tend to be retained by only one of the gene duplicates (partitioning). In addition, the pattern of partitioning of multiple stress responses is extremely asymmetric, where one duplicate tends to inherit most or all of the ancestral stress responses. We present evidence that the asymmetric loss of stress responses is correlated with the asymmetric loss of putative transcription factor binding sites. Interestingly, those duplicate genes inheriting few or no ancestral responses tend to have gained new stress responses, providing support for the model that gene duplicates are a source of innovation. Our findings provide important insight into the mechanisms of gene function evolution and lay the foundation for experimental studies to determine the significance of gain of stress responses in plant adaptation.
The structure and function of a protein are dependent on coordinated interactions between its residues. The selective pressures associated with a mutation at one site should therefore depend on the amino acid identity of interacting sites. Mutual information has previously been applied to multiple sequence alignments as a means of detecting coevolutionary interactions. Here, we introduce a refinement of the mutual information method that: 1) removes a significant, non-coevolutionary bias and 2) accounts for heteroscedasticity. Using a large, non-overlapping database of protein alignments, we demonstrate that predicted coevolving residue pairs tend to lie in close physical proximity. We introduce coevolution potentials as a novel measure of the propensity for the 20 amino acids to pair amongst predicted coevolutionary interactions. Ionic, hydrogen, and disulfide bond-forming pairs exhibited the highest potentials. Finally, we demonstrate that pairs of catalytic residues have a significantly increased likelihood of being identified as coevolving. These correlations to distinct protein features verify the accuracy of our algorithm and are consistent with a model of coevolution in which selective pressures towards preserving residue interactions act to shape the mutational landscape of a protein by restricting the set of admissible neutral mutations.
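As a point of reference for the method being refined, the following is a minimal sketch of plain mutual information between two columns of a multiple sequence alignment. It does not implement the bias removal or the heteroscedasticity correction described above; the toy alignment and the function names `column_entropy` and `mutual_information` are illustrative assumptions, not part of the study.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the symbols observed in one column."""
    n = len(column)
    counts = Counter(column)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mutual_information(col_a, col_b):
    """Plain mutual information MI = H(A) + H(B) - H(A, B) for two columns."""
    if len(col_a) != len(col_b):
        raise ValueError("columns must come from the same set of sequences")
    joint = list(zip(col_a, col_b))  # paired residues, one pair per sequence
    return column_entropy(col_a) + column_entropy(col_b) - column_entropy(joint)


# Hypothetical four-sequence alignment, invented for illustration only.
alignment = ["AKLE", "AKLD", "CRLE", "CRLD"]
columns = list(zip(*alignment))  # transpose rows (sequences) into columns

mi_covarying = mutual_information(columns[0], columns[1])    # A/C tracks K/R
mi_invariant = mutual_information(columns[0], columns[2])    # column 2 is all L
mi_independent = mutual_information(columns[0], columns[3])  # no association
```

In this toy case the perfectly covarying pair scores 1 bit, while the invariant and independent pairs score 0. Real alignments are far noisier, which is one reason corrections to raw mutual information are needed.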
Despite the emerging experimental techniques for perturbing multiple genes and measuring their quantitative phenotypic effects, genetic interactions have remained extremely difficult to predict on a large scale. Using a recent high-resolution screen of genetic interactions in yeast as a case study, we investigated whether the extraction of pertinent information encoded in the quantitative phenotypic measurements could be improved by computational means. By taking advantage of the observation that most gene pairs in the genetic interaction screens have no significant interactions with each other, we developed a sequential approximation procedure which ranks the mutation pairs in order of evidence for a genetic interaction. The sequential approximations can efficiently remove background variation in the double-mutation screens and give increasingly accurate estimates of the single-mutant fitness measurements. Interestingly, these estimates not only provide predictions for genetic interactions which are consistent with those obtained using the measured fitness, but they can even significantly improve the accuracy with which one can distinguish functionally-related gene pairs from the non-interacting pairs. The computational approach, in general, enables an efficient exploration and classification of genetic interactions in other studies and systems as well.
Two-component systems are an evolutionarily ancient means for signal transduction. These systems are comprised of a number of distinct elements, namely histidine kinases, response regulators, and in the case of multi-step phosphorelays, histidine-containing phosphotransfer proteins (HPts). Arabidopsis makes use of a two-component signaling system to mediate the response to the plant hormone cytokinin. Two-component signaling elements have also been implicated in plant responses to ethylene, abiotic stresses, and red light, and in regulating various aspects of plant growth and development. Here we present an overview of the two-component signaling elements found in Arabidopsis, including functional and phylogenetic information on both bona-fide and divergent elements.
The ability to respond to natural selection under novel conditions is critical for the establishment and persistence of introduced alien species and their ability to become invasive. Here we correlated neutral and quantitative genetic diversity of the weed Pennisetum setaceum Forsk. Chiov. (Poaceae) with differing global (North American and African) patterns of invasiveness and compared this diversity to native range populations. Numerous molecular markers indicate complete monoclonality within and among all of these areas (FST = 0.0), a finding supported by extremely low quantitative trait variance (QST = 0.00065–0.00952). The results support the hypothesis of a general-purpose genotype that can tolerate all environmental variation. However, a single global genotype and widespread invasiveness under numerous environmental conditions suggest a super-genotype. The super-genotype described here likely evolved high levels of plasticity in response to fluctuating environmental conditions during the Early to Mid Holocene. During the Late Holocene, when environmental conditions were predominantly constant but extremely inclement, strong selection resulted in only a few surviving genotypes.
A useful DNA barcode requires sufficient sequence variation to distinguish between species and ease of application across a broad range of taxa. Discovery of a DNA barcode for land plants has been limited by intrinsically lower rates of sequence evolution in plant genomes than those observed in animals. This low rate has complicated the trade-off in finding a locus that is universal and readily sequenced and has sufficiently high sequence divergence at the species level.
Here, a global plant DNA barcode system is evaluated by comparing universal application and degree of sequence divergence for nine putative barcode loci, including coding and non-coding regions, singly and in pairs across a phylogenetically diverse set of 48 genera (two species per genus). No single locus could discriminate among species in a pair in more than 79% of genera, whereas discrimination increased to nearly 88% when the non-coding trnH-psbA spacer was paired with one of three coding loci, including rbcL. In silico trials were conducted in which DNA sequences from GenBank were used to further evaluate the discriminatory power of a subset of these loci. These trials supported the earlier observation that trnH-psbA coupled with rbcL can correctly identify and discriminate among related species.
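The single-locus versus paired-locus comparison can be sketched as follows. This is a simplified illustration that assumes a species pair counts as discriminated when the two sequences differ at any position in at least one of the chosen loci; the study's actual divergence criteria are not given here, and the genus labels, sequences, and function names are hypothetical.

```python
def discriminates(seq_a, seq_b):
    """True when both sequences were recovered and they are not identical."""
    return seq_a is not None and seq_b is not None and seq_a != seq_b


def discrimination_rate(pairs, loci):
    """Fraction of genera whose species pair differs at one or more of `loci`."""
    hits = sum(
        any(discriminates(sp1.get(locus), sp2.get(locus)) for locus in loci)
        for sp1, sp2 in pairs.values()
    )
    return hits / len(pairs)


# Invented toy data: one pair of species per genus, short made-up sequences.
pairs = {
    "genus_1": ({"rbcL": "ATGC", "trnH-psbA": "AATT"},
                {"rbcL": "ATGC", "trnH-psbA": "AATA"}),
    "genus_2": ({"rbcL": "ATGC", "trnH-psbA": "GGCC"},
                {"rbcL": "ATGA", "trnH-psbA": "GGCC"}),
    "genus_3": ({"rbcL": "TTGC", "trnH-psbA": "CCAA"},
                {"rbcL": "TTGC", "trnH-psbA": "CCAA"}),
    "genus_4": ({"rbcL": "ATGG", "trnH-psbA": "ACGT"},
                {"rbcL": "ATGG", "trnH-psbA": "ACGA"}),
}

rate_rbcl = discrimination_rate(pairs, ["rbcL"])
rate_spacer = discrimination_rate(pairs, ["trnH-psbA"])
rate_combined = discrimination_rate(pairs, ["rbcL", "trnH-psbA"])
```

On the toy data the combined rate exceeds either single-locus rate because the two loci separate different genera, which mirrors the reasoning behind pairing a non-coding spacer with a coding locus.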
A combination of the non-coding trnH-psbA spacer region and a portion of the coding rbcL gene is recommended as a two-locus global land plant barcode that provides the necessary universality and species discrimination.
Analysis of Arabidopsis and rice polygalacturonases suggests that polygalacturonase duplicates underwent rapid expression divergence and that the mechanisms of duplication affect the divergence rate.
Polygalacturonases (PGs) belong to a large gene family in plants and are believed to be responsible for various cell separation processes. PG activities have been shown to be associated with a wide range of plant developmental programs such as seed germination, organ abscission, pod and anther dehiscence, pollen grain maturation, fruit softening and decay, xylem cell formation, and pollen tube growth, thus illustrating divergent roles for members of this gene family. A close look at phylogenetic relationships among Arabidopsis and rice PGs accompanied by analysis of expression data provides an opportunity to address key questions on the evolution and functions of duplicate genes.
We found that both tandem and whole-genome duplications contribute significantly to the expansion of this gene family but are associated with substantial gene losses. In addition, there are at least 21 PGs in the common ancestor of Arabidopsis and rice. We have also determined the relationships between Arabidopsis and rice PGs and their expression patterns in Arabidopsis to provide insights into the functional divergence between members of this gene family. By evaluating expression in five Arabidopsis tissues and during five stages of abscission, we found overlapping but distinct expression patterns for most of the different PGs.
Expression data suggest specialized roles or subfunctionalization for each PG gene member. PGs derived from whole genome duplication tend to have more similar expression patterns than those derived from tandem duplications. Our findings suggest that PG duplicates underwent rapid expression divergence and that the mechanisms of duplication affect the divergence rate.
Reactive oxygen species (ROS) are produced in plant cells in response to diverse biotic and abiotic stresses as well as during normal growth and development. Although a large number of transcription factor (TF) genes are up- or down-regulated by ROS, currently very little is known about the functions of these TFs during oxidative stress. In this work, we examined the role of ERF6 (ETHYLENE RESPONSE FACTOR6), an AP2/ERF domain-containing TF, during oxidative stress responses in Arabidopsis. Mutant analyses showed that NADPH oxidase (RbohD) and calcium signaling are required for ROS-responsive expression of ERF6. erf6 insertion mutant plants showed reduced growth and increased H2O2 and anthocyanin levels. Expression analyses of selected ROS-responsive genes during oxidative stress identified several differentially expressed genes in the erf6 mutant. In particular, a number of ROS responsive genes, such as ZAT12, HSFs, WRKYs, MAPKs, RBOHs, DHAR1, APX4, and CAT1 were more strongly induced by H2O2 in erf6 plants than in wild-type. In contrast, MDAR3, CAT3, VTC2 and EX1 showed reduced expression levels in the erf6 mutant. Taken together, our results indicate that ERF6 plays an important role as a positive antioxidant regulator during plant growth and in response to biotic and abiotic stresses.
Alternative splicing plays a major role in expanding the potential informational content of eukaryotic genomes. It is an important post-transcriptional regulatory mechanism that can increase protein diversity and affect mRNA stability. Alternative splicing is often regulated in a tissue-specific and stress-responsive manner. Cold stress, which adversely affects plant growth and development, regulates the transcription and splicing of plant splicing factors. This can affect the pre-mRNA processing of many genes. To identify cold regulated alternative splicing we applied Affymetrix Arabidopsis tiling arrays to survey the transcriptome under cold treatment conditions. A novel algorithm was used for detection of statistically relevant changes in intron expression within a transcript between control and cold growth conditions. A reverse transcription polymerase chain reaction (RT-PCR) analysis of a number of randomly selected genes confirmed the changes in splicing patterns under cold stress predicted by tiling array. Our analysis revealed new types of cold responsive genes. While their expression level remains relatively unchanged under cold stress their splicing pattern shows detectable changes in the relative abundance of isoforms. The majority of cold regulated alternative splicing introduced a premature termination codon (PTC) into the transcripts creating potential targets for degradation by the nonsense mediated mRNA decay (NMD) process. A number of these genes were analyzed in NMD-defective mutants by RT-PCR and shown to evade NMD. This may result in new and truncated proteins with altered functions or dominant negative effects. The results indicate that cold affects both quantitative and qualitative aspects of gene expression.
Leymus chinensis (Trin.) Tzvel. is a highly saline-alkaline-tolerant forage grass of the Gramineae family that also plays an important role in the protection of the natural environment. To date, little is known about the saline-alkaline tolerance of L. chinensis at the molecular level. To better understand the molecular mechanism of saline-alkaline tolerance in L. chinensis, 454 pyrosequencing was used for the transcriptome study.
We used Roche-454 massively parallel pyrosequencing technology to sequence two cDNA libraries built from a control sample and a sample under saline-alkaline treatment (optimal stress concentration: Hoagland solution with 100 mM NaCl and 200 mM NaHCO3). A total of 363,734 reads in the control group and 526,267 reads in the treatment group were obtained, with average lengths of 489 bp and 493 bp, respectively. The reads were assembled into 104,105 unigenes with the MIRA sequence assembly software, of which 73,665 unigenes were in the control group, 88,016 unigenes in the treatment group and 57,576 unigenes in both groups. According to the comparative expression analysis between the two groups with the threshold of “log2 Ratio ≥1”, there were 36,497 up-regulated unigenes and 18,218 down-regulated unigenes predicted to be differentially expressed genes. After gene annotation and pathway enrichment analysis, most of them were found to be involved in stress and tolerance functions, signal transduction, energy production and conversion, and inorganic ion transport. Furthermore, 16 of these differentially expressed genes were selected for real-time PCR validation, and the results confirmed those of 454 pyrosequencing.
This work is the first study of the transcriptome of L. chinensis under saline-alkaline treatment based on the 454-FLX massively parallel DNA sequencing platform. It also deepens the understanding of the molecular mechanisms of saline-alkaline tolerance in L. chinensis and provides a database for future studies.
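The unigene counts and the expression threshold reported above can be illustrated with a short sketch. It assumes that the threshold is applied to the absolute log2 ratio of treatment to control abundance and that a pseudocount of 1 is added to avoid division by zero; neither detail is stated in the abstract, and the function name `classify_unigene` is hypothetical.

```python
import math

# Counts taken from the text: unigenes in each library and in both.
CONTROL_UNIGENES = 73_665
TREATMENT_UNIGENES = 88_016
SHARED_UNIGENES = 57_576

# Inclusion-exclusion: unigenes present in at least one library.
total_unigenes = CONTROL_UNIGENES + TREATMENT_UNIGENES - SHARED_UNIGENES


def classify_unigene(control, treatment, threshold=1.0, pseudocount=1.0):
    """Label a unigene by its log2(treatment / control) abundance ratio.

    The pseudocount and the symmetric cutoff are assumptions made for this
    sketch, not details reported by the study.
    """
    ratio = math.log2((treatment + pseudocount) / (control + pseudocount))
    if ratio >= threshold:
        return "up"
    if ratio <= -threshold:
        return "down"
    return "unchanged"


# Made-up read counts for three hypothetical unigenes.
labels = [
    classify_unigene(10, 40),  # strongly higher under treatment
    classify_unigene(40, 10),  # strongly lower under treatment
    classify_unigene(10, 12),  # within the threshold
]
```

The inclusion-exclusion total reproduces the 104,105 assembled unigenes reported in the text, which is a quick consistency check on the three library counts.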
The availability of the complete genome sequence of soybean has allowed the research community to design the 66 K Affymetrix Soybean Array GeneChip for genome-wide expression profiling of soybean. In this study, we carried out microarray analysis of leaf tissues of soybean plants that were subjected to drought stress from the late vegetative V6 and the full bloom reproductive R2 stages. Our data analyses showed that, of the 46093 soybean genes predicted with high confidence among approximately 66000 putative genes, 41059 could be assigned a known function. Using the criteria of a ratio change ≥ 2 and a q-value < 0.05, we identified 1458 and 1818 upregulated and 1582 and 1688 downregulated genes in drought-stressed V6 and R2 leaves, respectively. These datasets were classified into the 19 most abundant biological categories with similar proportions. Only 612 upregulated and 463 downregulated genes were shared between the two stages, suggesting that both conserved and unconserved pathways might be involved in the regulation of drought response at different stages of plant development. A comparative expression analysis using our datasets and that of drought-stressed Arabidopsis leaves revealed the existence of both conserved and species-specific mechanisms that regulate drought responses. Many upregulated genes encode either regulatory proteins, such as transcription factors (including those with high homology to Arabidopsis DREB, NAC, AREB and ZAT/STZ transcription factors), kinases and two-component system members, or functional proteins, e.g. late embryogenesis-abundant proteins, glycosyltransferases, glycoside hydrolases, defensins and glyoxalase I family proteins. A detailed analysis of the GmNAC family and the hormone-related gene category showed that expression of many GmNAC and hormone-related genes was altered by drought in V6 and/or R2 leaves.
Additionally, the downregulation of many photosynthesis-related genes, which contribute to growth retardation under drought stress, may serve as an adaptive mechanism for plant survival. This study has identified excellent drought-responsive candidate genes for in-depth characterization and future development of improved drought-tolerant transgenic soybeans.
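The selection criteria used in this study (a ratio change of at least 2 and a q-value below 0.05) and the comparison of gene lists between the V6 and R2 stages can be sketched as follows. This is an illustrative sketch only: the record layout, the function name and the gene identifiers are hypothetical, and treating a ratio of 0.5 or less as a twofold decrease is an assumption about how the criterion was applied.

```python
def select_regulated(records, min_fold=2.0, max_q=0.05):
    """Split genes into up- and downregulated sets.

    records: iterable of (gene_id, ratio, q_value), where ratio is the
    stressed/control expression ratio (hypothetical layout).
    """
    up, down = set(), set()
    for gene_id, ratio, q_value in records:
        if q_value >= max_q or ratio <= 0:
            continue  # not significant, or ratio not usable
        if ratio >= min_fold:
            up.add(gene_id)
        elif ratio <= 1.0 / min_fold:
            down.add(gene_id)
    return up, down


# Toy data with made-up gene identifiers, one list per developmental stage.
v6 = [("geneA", 3.1, 0.01), ("geneB", 0.4, 0.02), ("geneC", 2.5, 0.20)]
r2 = [("geneA", 2.2, 0.03), ("geneB", 1.1, 0.01), ("geneD", 0.3, 0.04)]

up_v6, down_v6 = select_regulated(v6)
up_r2, down_r2 = select_regulated(r2)

# Genes regulated in the same direction at both stages.
shared_up = up_v6 & up_r2
shared_down = down_v6 & down_r2
```

Set intersection is the natural operation here because the question is only which genes appear in both stage-specific lists, not by how much their expression changes.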
The R2R3MYB proteins comprise one of the largest families of transcription factors in plants. Although genome-wide analysis of this family has been carried out in some species, little is known about R2R3MYB genes in cucumber (Cucumis sativus L.).
This study identified 55 R2R3MYB genes in the latest cucumber genome. The CsR2R3MYB family contained the smallest number of identified genes among the species studied so far, owing to the absence of recent gene duplication events; this result was supported by genome distribution and gene duplication analyses. Phylogenetic analysis showed that the genes could be classified into 11 subgroups. Analysis of the evolutionary relationships and intron-exon organizations, which showed similarities with Arabidopsis, Vitis and Glycine R2R3MYB proteins, suggested strong gene conservation but also the expansion of particular functional genes during the evolution of the plant species. In addition, we found that 8 out of 55 (∼14.5%) cucumber R2R3MYB genes underwent alternative splicing events, producing a variety of transcripts from a single gene, which illustrates the high complexity of transcriptome regulation. Tissue-specific expression profiles showed that 50 cucumber R2R3MYB genes were expressed in at least one of the tissues, while the other 5 genes showed very low expression in all tissues tested, suggesting that cucumber R2R3MYB genes take part in many cellular processes. Transcript abundance analysis under abiotic treatments (NaCl, ABA and low temperature) identified a group of R2R3MYB genes that responded to one or more treatments.
This study has produced a comparative genomics analysis of the cucumber R2R3MYB gene family and has provided the first steps towards the selection of CsR2R3MYB genes for cloning and functional dissection that can be used in further studies to uncover their roles in cucumber growth and development.
Rice is sensitive to chilling stress, especially at the seedling stage. To elucidate the molecular genetic mechanisms of chilling tolerance in rice, comprehensive gene expression profiles of two rice genotypes (chilling-tolerant LTH and chilling-sensitive IR29) with contrasting responses to chilling stress were comparatively analyzed. Results revealed differential constitutive gene expression prior to stress and distinct global transcription reprogramming between the two rice genotypes under time-series chilling stress and subsequent recovery conditions. A set of genes with higher basal expression was identified in chilling-tolerant LTH compared with chilling-sensitive IR29, indicating their possible role in intrinsic tolerance to chilling stress. Under chilling stress, the major effect on gene expression was up-regulation in the chilling-tolerant genotype and strong repression in the chilling-sensitive genotype. Early responses to chilling stress in both genotypes featured commonly up-regulated genes related to transcription regulation and signal transduction, while functional categories for late-phase chilling-regulated genes were diverse, with a wide range of functional adaptations to continuous stress. Following the cessation of chilling treatments, there was quick and efficient reversion of gene expression in the chilling-tolerant genotype, while the chilling-sensitive genotype displayed considerably slower recovery at the transcriptional level. In addition, the detection of differentially regulated TF genes and enriched cis-elements demonstrated that multiple regulatory pathways, including the CBF and MYBS3 regulons, were involved in chilling stress tolerance. A number of the chilling-regulated genes identified in this study co-localized with previously fine-mapped cold-tolerance-related QTLs, providing candidates for gene cloning and elucidation of the molecular mechanisms responsible for chilling tolerance in rice.
Expression divergence is thought to be a hallmark of functional diversification between homologs after duplication. Modification of regulatory elements has been invoked to explain expression divergence after duplication for several MADS-box genes; however, verification of reciprocal loss of cis-regulatory elements is lacking in plants. Here, we report that the evolution of MPF2-like genes has entailed degenerative mutations in a core promoter CArG-box and in an auxin response factor (ARF) binding element, the auxin response element (ARE), in the large first intron in the coding region. Previously, MPF2-like genes were duplicated into MPF2-like-A and -B through genome duplication in Withania and Tubocapsicum (Withaninae). The calyx of Withania grows exorbitantly after pollination, unlike that of Tubocapsicum, where it degenerates. Besides the formation of the inflated calyx syndrome (ICS), MPF2-like transcription factors are implicated in functions during both vegetative and reproductive development as well as in phase transition. MPF2-like-A of Withania (WSA206) is strongly expressed in sepals, while MPF2-like-B (WSB206) is not. Interestingly, their combined expression patterns seem to replicate the pattern of their closely related hypothetical progenitors from Vassobia and Physalis. Using phylogenetic shadowing, site-directed mutagenesis and motif swapping, we could show that the loss of a conserved CArG-box in MPF2-like-B of Withania is responsible for impeding its expression in sepals. Conversely, loss of an ARE in MPF2-like-A relaxed the constraint on expression in sepals. Thus, the ARE is an active suppressor of MPF2-like gene expression in sepals, which in contrast is activated via the CArG-box. The observed expression divergence in MPF2-like genes due to reciprocal loss of cis-regulatory elements has added to genetic and phenotypic variation in the Withaninae and enhanced the potential of natural selection for the adaptive evolution of ICS.
Moreover, these results provide insight into the interplay of floral developmental and hormonal pathways during ICS development and add to the understanding of the importance of polyploidy in plants.
Ligating adapters with unique synthetic oligonucleotide sequences (sequence tags) onto individual DNA samples before massively parallel sequencing is a popular and efficient way to obtain sequence data from many individual samples. Tag sequences should be numerous and sufficiently different to ensure that sequencing, replication, and oligonucleotide synthesis errors do not cause tags to be unrecoverable or confused. However, many design approaches only protect against substitution errors during sequencing, and extant tag sets contain too few tag sequences. We developed an open-source software package to validate sequence tags for conformance to two distance metrics and to design sequence tags robust to indel and substitution errors. We use this software package to evaluate several commercial and non-commercial sequence tag sets, design several large sets (maxcount = 7,198) of edit metric sequence tags having different lengths and degrees of error correction, and integrate a subset of these edit metric tags into polymerase chain reaction (PCR) primers and sequencing adapters. We validate a subset of these edit metric tagged PCR primers and sequencing adapters by sequencing on several platforms and subsequent comparison to commercially available alternatives. We find that several commonly used sets of sequence tags, or the design methodologies used to produce them, do not meet the minimum expectations of their underlying distance metric, and we find that PCR primers and sequencing adapters incorporating edit metric sequence tags designed by our software package perform as well as their commercial counterparts. We suggest that researchers evaluate sequence tags prior to use or evaluate tags that they have been using. The sequence tag sets we design improve on extant sets because they are large, valid across the set, and robust to the suite of substitution, insertion, and deletion errors affecting massively parallel sequencing workflows on all currently used platforms.
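The kind of validation described above, checking that every pair of tags in a set is separated by at least a minimum distance, can be illustrated with a short sketch. It is not the authors' software package: the function names are hypothetical, the edit metric is implemented as the standard Levenshtein distance, and the use of Hamming distance as the second, substitution-only metric is an assumption, since the abstract does not name it.

```python
from itertools import combinations


def edit_distance(a, b):
    """Levenshtein distance: minimum substitutions, insertions and deletions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion from a
                           cur[j - 1] + 1,             # insertion into a
                           prev[j - 1] + (ca != cb)))  # match or substitution
        prev = cur
    return prev[-1]


def hamming_distance(a, b):
    """Number of mismatched positions; only defined for equal-length tags."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires equal-length sequences")
    return sum(x != y for x, y in zip(a, b))


def violating_pairs(tags, min_distance, metric=edit_distance):
    """Return every pair of tags closer than min_distance under the metric."""
    return [(a, b) for a, b in combinations(tags, 2)
            if metric(a, b) < min_distance]


# Made-up tags: ACGT and CGTA differ at all four positions (Hamming distance 4)
# but are only two edits apart (delete the leading A, append an A).
tags = ["ACGT", "CGTA", "TTTT"]
bad_under_edit = violating_pairs(tags, 3, metric=edit_distance)
bad_under_hamming = violating_pairs(tags, 3, metric=hamming_distance)
```

The toy set passes a minimum-distance check of 3 under the substitution-only metric but fails it under the edit metric, which is the situation the abstract warns about when indel errors are possible.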
Simple sequence repeat (SSR) markers are widely used tools for inferences about genetic diversity, phylogeography and spatial genetic structure. Their applications assume that variation among alleles is essentially caused by an expansion or contraction of the number of repeats and, secondarily, that mutations in the target sequences follow the stepwise mutation model (SMM). Generally speaking, PCR amplicon sizes are used as direct indicators of the number of SSR repeats composing an allele, with the data analysis either ignoring the extent of allele size differences or assuming that there is a direct correlation between differences in amplicon size and evolutionary distance. However, without precisely knowing the kind and distribution of polymorphism within an allele (SSR and the associated flanking region (FR) sequences), it is hard to say what kind of evolutionary message is conveyed by such a synthetic descriptor of polymorphism as DNA amplicon size. In this study, we sequenced several SSR alleles in multiple populations of three divergent tree genera and disentangled the types of polymorphisms contained in each portion of the DNA amplicon containing an SSR. The patterns of diversity provided by amplicon size variation, SSR variation itself, and the insertions/deletions (indels) and single nucleotide polymorphisms (SNPs) observed in the FRs were compared. Amplicon size variation largely reflected SSR repeat number. The amount of variation was as large in the FRs as in the SSR itself. Variation in the FRs contributed significantly to the phylogenetic information and was sometimes the main source of the differentiation among individuals and populations carried by the FR and SSR regions of SSR markers.
The presence of mutations occurring at different rates within a marker’s sequence offers the opportunity to analyse evolutionary events occurring on various timescales, but at the same time calls for caution in the interpretation of SSR marker data when the distribution of within-locus polymorphism is not known.
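Why amplicon size alone can mislead may be seen in a small sketch. It models an amplicon as left flank + repeated motif + right flank; the sequences, the 2-bp flanking deletion and the function names are invented for illustration and do not come from the study.

```python
def amplicon_length(left_flank, motif, n_repeats, right_flank):
    """Length in bp of an amplicon made of flank + SSR repeats + flank."""
    return len(left_flank) + len(motif) * n_repeats + len(right_flank)


def naive_repeat_difference(size_a, size_b, motif_length):
    """Repeat-number difference inferred from amplicon sizes alone."""
    return (size_a - size_b) / motif_length


# Hypothetical alleles: B has one more GA repeat than A, but also carries a
# 2-bp deletion in its left flanking region.
allele_a = ("ACGTAC", "GA", 10, "GGTCA")
allele_b = ("ACGT", "GA", 11, "GGTCA")

size_a = amplicon_length(*allele_a)
size_b = amplicon_length(*allele_b)
inferred_difference = naive_repeat_difference(size_a, size_b, motif_length=2)
true_difference = allele_a[2] - allele_b[2]
```

In this constructed case the two alleles have identical amplicon sizes, so a size-based analysis would infer no repeat difference even though the alleles differ by one repeat and by a flanking indel.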