The MYB gene family comprises one of the richest groups of transcription factors in plants. Plant MYB proteins are characterized by a highly conserved MYB DNA-binding domain. MYB proteins are classified into four major groups, namely 1R-MYB, 2R-MYB, 3R-MYB and 4R-MYB, based on the number and position of MYB repeats. MYB transcription factors are involved in plant development, secondary metabolism, hormone signal transduction, disease resistance and abiotic stress tolerance. A comparative analysis of MYB family genes in rice and Arabidopsis will help reveal the evolution and function of MYB genes in plants.
A genome-wide analysis identified at least 155 and 197 MYB genes in rice and Arabidopsis, respectively. Gene structure analysis revealed that MYB family genes possess relatively more introns in the middle than in the C- and N-terminal regions of the predicted genes. Intronless MYB genes are highly conserved in both rice and Arabidopsis. MYB genes encoding R2R3 repeat MYB proteins retained a conserved gene structure with three exons and two introns, whereas genes encoding R1R2R3 repeat-containing proteins consist of six exons and five introns. The splicing pattern is similar among R1R2R3 MYB genes in Arabidopsis. In contrast, variation in splicing pattern was observed among R1R2R3 MYB members of rice. Consensus motif analysis of the 1 kb upstream region (5′ to the translation initiation codon) of MYB gene ORFs led to the identification of conserved and over-represented cis-motifs in both rice and Arabidopsis. Real-time quantitative RT-PCR analysis showed that several MYB genes are up-regulated by various abiotic stresses in both rice and Arabidopsis.
A comprehensive genome-wide analysis of chromosomal distribution, tandem repeats and phylogenetic relationship of MYB family genes in rice and Arabidopsis suggested their evolution via duplication. Genome-wide comparative analysis of MYB genes and their expression analysis identified several MYBs with potential roles in development and stress response of plants.
Finding where transcription factors (TFs) bind to the DNA is of key importance to decipher gene regulation at a transcriptional level. Classically, computational prediction of TF binding sites (TFBSs) is based on basic position weight matrices (PWMs) which quantitatively score binding motifs based on the observed nucleotide patterns in a set of TFBSs for the corresponding TF. Such models make the strong assumption that each nucleotide participates independently in the corresponding DNA-protein interaction and do not account for flexible-length motifs. We introduce transcription factor flexible models (TFFMs) to represent TF binding properties. Based on hidden Markov models, TFFMs are flexible, and can model both position interdependence within TFBSs and variable-length motifs within a single dedicated framework. The availability of thousands of experimentally validated DNA-TF interaction sequences from ChIP-seq allows for the generation of models that perform as well as PWMs for stereotypical TFs and can improve performance for TFs with flexible binding characteristics. We present a new graphical representation of the motifs that conveys properties of position interdependence. TFFMs have been assessed on ChIP-seq data sets coming from the ENCODE project, revealing that they can perform better than both PWMs and the dinucleotide weight matrix extension in discriminating ChIP-seq from background sequences. Under the assumption that ChIP-seq signal values are correlated with the affinity of the TF-DNA binding, we find that TFFM scores correlate with ChIP-seq peak signals. Moreover, using available TF-DNA affinity measurements for the Max TF, we demonstrate that TFFMs constructed from ChIP-seq data correlate with published experimentally measured DNA-binding affinities. Finally, TFFMs allow for the straightforward computation of an integrated TF occupancy score across a sequence.
These results demonstrate the capacity of TFFMs to accurately model DNA-protein interactions, while providing a single unified framework suitable for the next generation of TFBS prediction.
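The classical PWM approach that TFFMs are compared against can be illustrated with a minimal sketch. It is not the TFFM method itself; it only shows the position-independence assumption described above. The toy sites, the pseudocount and the uniform background frequency are assumptions made for illustration.

```python
import math

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds position weight matrix from equal-length sites."""
    total = len(sites) + 4 * pseudocount
    pwm = []
    for pos in range(len(sites[0])):
        column = [site[pos] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm

def score(pwm, window):
    """Sum per-position log-odds; every position contributes independently."""
    return sum(column[base] for column, base in zip(pwm, window))

def best_hit(pwm, sequence):
    """Return (score, offset) of the highest-scoring window in the sequence."""
    width = len(pwm)
    return max((score(pwm, sequence[i:i + width]), i)
               for i in range(len(sequence) - width + 1))

# Toy, made-up binding sites (not real ChIP-seq data)
toy_sites = ["CACGTG", "CACGTG", "CACGTT", "CATGTG"]
pwm = build_pwm(toy_sites)
print(best_hit(pwm, "TTTCACGTGAAA"))
```

Because `score` is a plain sum over columns, such a model cannot express dependencies between neighbouring positions or variable spacing, which is the limitation the HMM-based models are designed to address.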
Transcription factors are critical proteins for sequence-specific control of transcriptional regulation. Finding where these proteins bind to DNA is of key importance for global efforts to decipher the complex mechanisms of gene regulation. Greater understanding of the regulation of transcription promises to improve human genetic analysis by specifying critical gene components that have eluded investigators. Classically, computational prediction of transcription factor binding sites (TFBS) is based on models giving weights to each nucleotide at each position. We introduce a novel statistical model for the prediction of TFBS that tolerates a broader range of TFBS configurations than can be conveniently accommodated by existing methods. The new models are designed to address the confounding properties of nucleotide composition, inter-positional sequence dependence and variable lengths (e.g. variable spacing between half-sites) observed in the more comprehensive experimental data now emerging. The new models generate scores consistent with DNA-protein affinities measured experimentally and can be represented graphically, retaining desirable attributes of past methods. This work demonstrates the capacity of the new approach to accurately assess DNA-protein interactions. With the rich experimental data generated from chromatin immunoprecipitation experiments, a greater diversity of TFBS properties has emerged that can now be accommodated within a single predictive approach.
The transcription regulatory properties of murine B-myb protein were compared to those of c-myb. Whereas c-Myb trans-activated an SV40 early promoter containing multiple copies of an upstream c-Myb DNA-binding site (MBS-1), and similarly the human c-myc promoter, B-Myb was unable to do so. Full-length B-Myb translated in vitro did not bind MBS-1; however, truncation of the B-Myb C-terminus or fusion of the B-Myb DNA-binding domain to the c-Myb C-terminus showed that it was inherently competent to interact with this motif. Further evidence from co-transfection experiments, demonstrating that B-Myb inhibited trans-activation by c-Myb, suggested that failure of B-Myb to trans-activate these promoters did not simply occur through lack of binding to MBS-1. Moreover, using GAL4/B-Myb fusions, it was found that an acidic region of B-Myb, which by comparison to c-Myb was expected to contain a transcription activation domain, actually had no inherent trans-activation activity and indeed appeared to trans-inhibit c-Myb. In contrast to the above findings, both B-Myb and c-Myb were able to weakly trans-activate the DNA polymerase alpha promoter. Results obtained here demonstrate that the activities of B-Myb and c-Myb are clearly distinct and suggest that these related proteins may have different functions in regulation of target gene expression.
The RAG-2 gene encodes a component of the V(D)J recombinase which is essential for the assembly of antigen receptor genes in B and T lymphocytes. Previously, we reported that the transcription factor BSAP (PAX-5) regulates the murine RAG-2 promoter in B-cell lines. A partially overlapping but distinct region of the proximal RAG-2 promoter was also identified as an important element for promoter activity in T cells; however, the responsible factor was unknown. In this report, we present data demonstrating that c-Myb binds to a Myb consensus site within the proximal promoter and is critical for its activity in T-lineage cells. We show that c-Myb can transactivate a RAG-2 promoter-reporter construct in cotransfection assays and that this transactivation depends on the proximal promoter Myb consensus site. By using a chromatin immunoprecipitation (ChIP) strategy, fractionation of chromatin with anti-c-Myb antibody specifically enriched endogenous RAG-2 promoter DNA sequences. DNase I genomic footprinting revealed that the c-Myb site is occupied in a tissue-specific fashion in vivo. Furthermore, an integrated RAG-2 promoter construct with mutations at the c-Myb site was not enriched in the ChIP assay, while a wild-type integrated promoter construct was enriched. Finally, this lack of binding of c-Myb to a chromosomally integrated mutant RAG-2 promoter construct in vivo was associated with a striking decrease in promoter activity. We conclude that c-Myb regulates the RAG-2 promoter in T cells by binding to this consensus c-Myb binding site.
Redundancy and competition between R2R3-MYB activators and repressors on common target genes have been proposed as a fine-tuning mechanism for the regulation of plant secondary metabolism. This hypothesis was tested in white spruce [Picea glauca (Moench) Voss] by investigating the effects of R2R3-MYBs from different subgroups on common targets from distinct metabolic pathways. Comparative analysis of transcript profiling data in spruces overexpressing R2R3-MYBs from loblolly pine (Pinus taeda L.), PtMYB1, PtMYB8, and PtMYB14, defined a set of common genes that display opposite regulation effects. The relationship between the closest MYB homologues and 33 putative target genes was explored by quantitative PCR expression profiling in wild-type P. glauca plants during the diurnal cycle. Significant Spearman’s correlation estimates were consistent with the proposed opposite effect of different R2R3-MYBs on several putative target genes in a time-related and tissue-preferential manner. Expression of sequences coding for 4CL, DHS2, COMT1, SHM4, and a lipase thio/esterase positively correlated with that of PgMYB1 and PgMYB8, but negatively with that of PgMYB14 and PgMYB15. Complementary electrophoretic mobility shift assay (EMSA) and transactivation assay provided experimental evidence that these different R2R3-MYBs are able to bind similar AC cis-elements in the promoter region of Pg4CL and PgDHS2 genes but have opposite effects on their expression. Competitive binding EMSA experiments showed that PgMYB8 competes more strongly than PgMYB15 for the AC-I MYB binding site in the Pg4CL promoter. Together, the results bring a new perspective to the action of R2R3-MYB proteins in the regulation of distinct but interconnecting metabolic pathways.
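The kind of correlation analysis described above can be sketched as follows. This is a minimal pure-Python Spearman rank correlation under stated assumptions: the expression values are made up, and the variable names `activator`, `target` and `repressor` are hypothetical stand-ins rather than measured PgMYB profiles.

```python
def ranks(values):
    """Average 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Made-up relative expression values over six diurnal time points
activator = [1.0, 2.1, 3.9, 5.2, 4.0, 2.2]
target = [0.8, 1.9, 4.1, 6.0, 4.4, 2.0]
repressor = [5.0, 4.1, 2.0, 1.1, 1.9, 3.8]
print(spearman(activator, target), spearman(repressor, target))
```

In this toy data the target tracks the hypothetical activator and mirrors the hypothetical repressor, so the two coefficients have opposite signs, which is the pattern the study reports for several putative targets.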
Conifers; phenylpropanoid pathway; protein–DNA binding; R2R3-MYB evolution; transcriptional network.
Iron-inducible transcription of the ap65-1 gene in Trichomonas vaginalis involves at least three Myb-like transcriptional factors (tvMyb1, tvMyb2 and tvMyb3) that differentially bind to two closely spaced promoter sites, MRE-1/MRE-2r and MRE-2f. Here, we defined a fragment of tvMyb2 comprising residues 40–156, hereafter tvMyb2(40–156), as the minimum structural unit that retains near full binding affinity with the promoter DNAs. Like c-Myb in vertebrates, the DNA-free tvMyb2(40–156) has a flexible and open conformation. Upon binding to the promoter DNA elements, tvMyb2(40–156) undergoes significant conformational re-arrangement and structure stabilization. Crystal structures of tvMyb2(40–156) in complex with promoter element-containing DNA oligomers showed that 5′-a/gACGAT-3′ is the specific base sequence recognized by tvMyb2(40–156), which does not fully conform to that of the Myb binding site sequence. Furthermore, Lys49, which is upstream of the R2 motif (amino acids 52–102), also participates in specific DNA sequence recognition. Intriguingly, tvMyb2(40–156) binds to the promoter elements in an orientation opposite to that proposed in the HADDOCK model of the tvMyb1(35–141)/MRE-1-MRE-2r complex. These results shed new light on understanding the molecular mechanism of Myb–DNA recognition and provide a framework to study the molecular basis of transcriptional regulation of myriad Mybs in T. vaginalis.
Transcription factors (TFs) and their binding sites (TFBSs) play a central role in the regulation of gene expression. It is therefore vital to know how the allocation pattern of TFBSs affects the functioning of any particular gene in vivo. A widely used method to analyze TFBSs in vivo is chromatin immunoprecipitation (ChIP). However, this method in its present state does not enable the individual investigation of densely arranged TFBSs because the underlying DNA fragmentation technique is unspecific. This study describes a site-specific ChIP which combines the benefits of both EMSA and in vivo footprinting in a single assay, thereby allowing the individual detection and analysis of single binding motifs.
The standard ChIP protocol was modified by replacing the conventional DNA fragmentation, i.e. sonication or undirected enzymatic digestion (by MNase), with a sequence-specific enzymatic digestion step. This alteration enables the specific immunoprecipitation and individual examination of occupied sites, even in a complex system of adjacent binding motifs in vivo. Immunoprecipitated chromatin was analyzed by PCR using two primer sets: one for the specific detection of precipitated TFBSs and one for the validation of completeness of the enzyme digestion step. The method was established using Sp1 TFBSs within the egfr promoter region as an example. Using this site-specific ChIP, we were able to confirm that four previously described Sp1 binding sites within the egfr promoter region are occupied by Sp1 in vivo. Despite the dense arrangement of the Sp1 TFBSs, the improved ChIP method was able to individually examine the allocation of all adjacent Sp1 TFBSs at once. The broad applicability of this site-specific ChIP was demonstrated by analyzing these Sp1 motifs in both osteosarcoma cells and kidney carcinoma tissue.
The ChIP technology is a powerful tool for investigating transcription factors in vivo, especially in cancer biology. The established site-specific enzyme digestion enables reliable, individual detection of densely arranged binding motifs in vivo, which is not provided by, e.g., EMSA or in vivo footprinting. Given the important function of transcription factors in neoplastic mechanisms, our method opens up a broad range of applications for clinical studies.
The v-myb oncogene and its cellular homolog c-myb encode sequence-specific DNA-binding proteins which regulate transcription from promoters containing Myb-binding sites in animal cells. We have developed a Saccharomyces cerevisiae system to assay transcriptional activation by v-Myb and c-Myb. In yeast strains containing integrated reporter genes, activation was strictly dependent upon both the Myb DNA-binding domain and the Myb recognition element. BAS1, an endogenous Myb-related yeast protein, was not required for transactivation by animal Myb proteins and by itself had no detectable effect on a Myb reporter gene. Deletion analyses demonstrated that a domain of v-Myb C terminal to the previously mapped Myb transcriptional activation domain was required for transactivation in animal cells but not in S. cerevisiae. The same domain is also required for the efficient transformation of myeloid cells by v-Myb. In contrast to results in animal cells, in S. cerevisiae the full-length c-Myb was a much stronger transactivator than a protein bearing the oncogenic N- and C-terminal truncations of v-Myb. These results imply that negative regulation of c-Myb by its own termini requires an additional animal cell protein or small molecule that is not present in S. cerevisiae.
Myb genes from Arabidopsis and rice were clustered into subgroups. The distribution of introns in the phylogenetic tree suggests that introns were inserted during evolution.
Myb proteins contain a conserved DNA-binding domain composed of one to four repeat motifs (referred to as R0R1R2R3); each repeat is approximately 50 amino acids in length, with regularly spaced tryptophan residues. Although the Myb proteins comprise one of the largest families of transcription factors in plants, little is known about the functions of most Myb genes. Here we use computational techniques to classify Myb genes on the basis of sequence similarity and gene structure, and to identify possible functional relationships among subgroups of Myb genes from Arabidopsis and rice (Oryza sativa L. ssp. indica).
This study analyzed 130 Myb genes from Arabidopsis and 85 from rice. The collected Myb proteins were clustered into subgroups based on sequence similarity and phylogeny. Interestingly, the exon-intron structure differed between subgroups, but was conserved in the same subgroup. Moreover, the Myb domains contained a significant excess of phase 1 and 2 introns, as well as an excess of nonsymmetric exons. Conserved motifs were detected in carboxy-terminal coding regions of Myb genes within subgroups. In contrast, no common regulatory motifs were identified in the noncoding regions. Additionally, some Myb genes with similar functions were clustered in the same subgroups.
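The intron phase and exon symmetry terms used above can be made concrete with a small sketch. It assumes the usual definitions: an intron's phase is the number of coding bases upstream of it modulo 3, and an internal exon is symmetric when its two flanking introns share the same phase. The exon lengths are hypothetical and do not describe any specific Myb gene.

```python
def intron_phases(exon_cds_lengths):
    """Phase of each intron = coding bases upstream of it, modulo 3."""
    phases, upstream = [], 0
    for length in exon_cds_lengths[:-1]:  # no intron follows the last exon
        upstream += length
        phases.append(upstream % 3)
    return phases

def symmetric_exons(exon_cds_lengths):
    """For each internal exon, True if both flanking introns share a phase."""
    phases = intron_phases(exon_cds_lengths)
    return [phases[i] == phases[i + 1] for i in range(len(phases) - 1)]

# Hypothetical three-exon gene; values are coding bases per exon
toy_gene = [133, 130, 400]
print(intron_phases(toy_gene))    # [1, 2]
print(symmetric_exons(toy_gene))  # [False]
```

In this toy gene the first intron is phase 1 and the second is phase 2, so the single internal exon is nonsymmetric.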
The distribution of introns in the phylogenetic tree suggests that Myb domains originally were compact in size; introns were inserted and the splicing sites conserved during evolution. Conserved motifs identified in the carboxy-terminal regions are specific for Myb genes, and the identified Myb gene subgroups may reflect functional conservation.
Plant microRNAs (miRNAs) are critical regulators of gene expression; however, little attention has been given to the principles governing miRNA silencing efficacy. Here, we utilize the highly conserved Arabidopsis miR159-MYB33/MYB65 regulatory module to explore these principles. Firstly, we show that perfect central complementarity is not required for strong silencing. Artificial miR159 variants with two cleavage site mismatches can potently silence MYB33/MYB65, fully complementing a loss-of-function mir159 mutant. Moreover, these miR159 variants can cleave MYB33/MYB65 mRNA; however, cleavage appears attenuated, as the ratio of cleavage products to full-length transcripts decreases with increasing central mismatches. Nevertheless, high levels of un-cleaved MYB33/MYB65 transcripts are strongly silenced by a non-cleavage mechanism. Contrary to MIR159a variants that strongly silenced endogenous MYB33/MYB65, artificial MYB33 variants with central mismatches to miR159 are not efficiently silenced. We demonstrate that differences in the miRNA:target mRNA stoichiometry underlie this paradox. Increasing miR159 abundance in the MYB33 variants results in a strong silencing outcome, whereas increasing MYB33 transcript levels in the MIR159a variants results in a poor silencing outcome. Finally, we identify highly conserved nucleotides that flank the miR159 binding site in MYB33, and demonstrate that they are critical for efficient silencing, as mutation of these flanking nucleotides attenuates silencing at a level similar to that of central mismatches. This implies that the context in which the miRNA binding site resides is a key determinant in controlling the degree of silencing and that a miRNA “target site” encompasses sequences that extend beyond the miRNA binding site. In conclusion, our findings dismiss the notion that miRNA:target complementarity, underpinned by central matches, is the sole dictator of the silencing outcome.
In plants, microRNAs (miRNAs) are critical regulators of gene expression. As most validated targets are of high complementarity, and their transcripts are cleaved by the miRNA, both complementarity and cleavage are thought to be the major factors determining the degree to which a target gene is silenced. Here, we explore this principle utilizing the highly conserved miR159-MYB33/MYB65 regulatory module in the model flowering plant Arabidopsis. Firstly, we demonstrate that perfect central complementarity facilitates efficient transcript cleavage but is not required for a strong silencing outcome, as miR159 variants with two central mismatches can recognize and silence MYB33/MYB65 effectively in planta. Driving this silencing is a potent miR159-mediated non-cleavage mechanism that ensures total silencing even when MYB33 transcript levels are very high. Secondly, we demonstrate that the stoichiometric ratio of miRNA to target mRNA is a critical determinant of a silencing outcome, and that this ratio becomes increasingly important for inefficient miRNA-target interactions. Finally, we show that nucleotides flanking the miR159 binding site of MYB33 are essential for efficient silencing, demonstrating that the sequence context in which the miRNA target site resides has a major impact on the silencing outcome. Together, we have shown that although high complementarity underpinned by efficient transcript cleavage may be a prerequisite for a strong silencing outcome, many additional factors that modulate the strength of the miRNA-target interaction are at play. These findings will have ramifications for bioinformatics prediction of miRNA targets and design of artificial miRNAs.
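The notion of central mismatches between a miRNA and its target site can be illustrated with a short sketch. Everything here is an assumption for illustration: the 21-nt sequence is hypothetical rather than the real miR159 sequence, positions 10 and 11 are simply chosen as "central" positions, and G:U wobble pairs are treated as mismatches.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}
SWAP = {"A": "C", "C": "A", "G": "U", "U": "G"}  # replaces a base with a different one

def perfect_target(mirna):
    """Fully complementary target site, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(mirna))

def mutate_target(target, mirna_positions):
    """Introduce a mismatch opposite each 1-based miRNA position."""
    bases = list(target)
    for pos in mirna_positions:
        idx = len(bases) - pos  # target base that pairs with miRNA position pos
        bases[idx] = SWAP[bases[idx]]
    return "".join(bases)

def mismatch_positions(mirna, target):
    """1-based miRNA positions (from the 5' end) lacking a Watson-Crick pair."""
    paired = zip(mirna, reversed(target))
    return [i for i, (m, t) in enumerate(paired, start=1) if COMPLEMENT[m] != t]

toy_mirna = "UGACCGUAUCGGAUCCAGUCA"  # hypothetical 21-nt sequence
wild_type = perfect_target(toy_mirna)
variant = mutate_target(wild_type, [10, 11])
print(mismatch_positions(toy_mirna, wild_type))  # []
print(mismatch_positions(toy_mirna, variant))    # [10, 11]
```

A complementarity count of this kind captures only one of the factors discussed above; it says nothing about stoichiometry or flanking sequence context.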
Mammalian spermatogenesis involves formation of haploid cells from the male germline and then a complex morphological transformation to generate motile sperm. Focusing on meiotic prophase, some tissue-specific transcription factors are known (A-MYB) or suspected (RFX2) to play important roles in modulating gene expression in pachytene spermatocytes. The current work was initiated to identify both downstream and upstream regulatory connections for Rfx2.
Searches of pachytene up-regulated genes identified high affinity RFX binding sites (X boxes) in promoter regions of several new genes: Adam5, Pdcl2, and Spag6. We confirmed a strong promoter-region X-box for Alf, a germ cell-specific variant of general transcription factor TFIIA. Using Alf as an example of a target gene, we showed that its promoter is stimulated by RFX2 in transfected cells and used ChIP analysis to show that the promoter is occupied by RFX2 in vivo. Turning to upstream regulation of the Rfx2 promoter, we identified a cluster of three binding sites (MBS) for the MYB family of transcription factors. Because testis is one of the few sites of A-myb expression, and because spermatogenesis arrests in pachytene in A-myb knockout mice, the MBS cluster implicates Rfx2 as an A-myb target. Electrophoretic gel-shift, ChIP, and co-transfection assays all support a role for these MYB sites in Rfx2 expression. Further, Rfx2 expression was virtually eliminated in A-myb knockout testes. Immunohistology on testis sections showed that A-MYB expression is up-regulated only after pachytene spermatocytes have clearly moved away from the tubule wall, which correlates with onset of RFX2 expression, whereas B-MYB expression, by contrast, is prevalent only in earlier spermatocytes and spermatogonia.
With an expanding list of likely target genes, RFX2 is potentially an important transcriptional regulator in pachytene spermatocytes. Rfx2 itself is a good candidate to be regulated by A-MYB, which is essential for meiotic progression. If Alf is a genuine RFX2 target, then A-myb, Rfx2, and Alf may form part of a transcriptional network that is vital for completion of meiosis and preparation for post-meiotic differentiation.
Transcriptional enhancers integrate the contributions of multiple classes of transcription factors (TFs) to orchestrate the myriad spatio-temporal gene expression programs that occur during development. A molecular understanding of enhancers with similar activities requires the identification of both their unique and their shared sequence features. To address this problem, we combined phylogenetic profiling with a DNA–based enhancer sequence classifier that analyzes the TF binding sites (TFBSs) governing the transcription of a co-expressed gene set. We first assembled a small number of enhancers that are active in Drosophila melanogaster muscle founder cells (FCs) and other mesodermal cell types. Using phylogenetic profiling, we increased the number of enhancers by incorporating orthologous but divergent sequences from other Drosophila species. Functional assays revealed that the diverged enhancer orthologs were active in patterns largely similar to those of their D. melanogaster counterparts, although there was extensive evolutionary shuffling of known TFBSs. We then built and trained a classifier using this enhancer set and identified additional related enhancers based on the presence or absence of known and putative TFBSs. Predicted FC enhancers were over-represented in proximity to known FC genes, and many of the TFBSs learned by the classifier were found to be critical for enhancer activity, including POU homeodomain, Myb, Ets, Forkhead, and T-box motifs. Empirical testing also revealed that the T-box TF encoded by org-1 is a previously uncharacterized regulator of muscle cell identity. Finally, we found extensive diversity in the composition of TFBSs within known FC enhancers, suggesting that motif combinatorics plays an essential role in the cellular specificity exhibited by such enhancers.
In summary, machine learning combined with evolutionary sequence analysis is useful for recognizing novel TFBSs and for facilitating the identification of cognate TFs that coordinate cell type–specific developmental gene expression patterns.
The development of multicellular organisms requires the formation of a diversity of cell types. Each cell has a unique genetic program that is orchestrated by regulatory sequences called enhancers, comprising multiple short DNA sequences that bind distinct transcription factors. Understanding developmental regulatory networks requires knowledge of the sequence features of functionally related enhancers. We developed an integrated evolutionary and computational approach for deciphering enhancer regulatory codes and applied this method to discover new components of the transcriptional network controlling muscle development in the fruit fly, Drosophila melanogaster. Our method involves assembling known muscle enhancers, expanding this set with evolutionarily conserved sequences, computationally classifying these enhancers based on their shared sequence features, and scanning the entire Drosophila genome to predict additional related enhancers. Using this approach, we created a map of 5,500 putative muscle enhancers, identified candidate transcription factors to which they bind, observed a strong correlation between mapped enhancers and muscle gene expression, and uncovered extensive heterogeneity among combinations of transcription factor binding sites in validated muscle enhancers, a feature that may contribute to the individual cellular specificities of these regulatory elements. Our strategy can readily be generalized to study transcriptional networks in other organisms and developmental contexts.
The MYB superfamily constitutes one of the most abundant groups of transcription factors described in plants. Nevertheless, their functions appear to be highly diverse and remain rather unclear. To date, no genome-wide characterization of this gene family has been conducted in a legume species. Here we report the first genome-wide analysis of the whole MYB superfamily in a legume species, soybean (Glycine max), including the gene structures, phylogeny, chromosome locations, conserved motifs, and expression patterns, as well as a comparative genomic analysis with Arabidopsis.
A total of 244 R2R3-MYB genes were identified and further classified into 48 subfamilies based on a phylogenetic comparative analysis with their putative orthologs, which showed both gene loss and duplication events. The phylogenetic analysis showed that most characterized MYB genes with similar functions are clustered in the same subfamily; together with the identification of orthologs by synteny analysis, this strongly indicated functional conservation among subgroups of MYB genes. The phylogenetic relationships of each subgroup of MYB genes were well supported by the highly conserved intron/exon structures and motifs outside the MYB domain. Analysis of the ratio of nonsynonymous to synonymous nucleotide substitutions (dN/dS) showed that the soybean MYB DNA-binding domain is under strong negative selection. The chromosome distribution pattern strongly indicated that genome-wide segmental and tandem duplications contributed to the expansion of soybean MYB genes. In addition, we found that ~4% of soybean R2R3-MYB genes had undergone alternative splicing events, producing a variety of transcripts from a single gene, which illustrates the extremely high complexity of transcriptome regulation. Comparative expression profile analysis of R2R3-MYB genes in soybean and Arabidopsis revealed that MYB genes play both conserved and varied roles in plants, the latter being indicative of a divergence in function.
In this study we identified the largest MYB gene family in plants known to date. Our findings indicate that members of this large gene family may be involved in different plant biological processes, some of which may be potentially involved in legume-specific nodulation. Our comparative genomics analysis provides a solid foundation for future functional dissection of this gene family.
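How a dN/dS value is commonly read can be sketched in a few lines. The function name, the tolerance band around 1 and the example rates are assumptions for illustration; estimating dN and dS from aligned coding sequences is a separate step not shown here.

```python
def selection_regime(dn, ds, tolerance=0.05):
    """Classify a dN/dS ratio; the tolerance band around 1 is arbitrary."""
    if ds <= 0:
        raise ValueError("dS must be positive to form the ratio")
    omega = dn / ds
    if omega < 1 - tolerance:
        return omega, "negative (purifying) selection"
    if omega > 1 + tolerance:
        return omega, "positive selection"
    return omega, "approximately neutral"

# Hypothetical substitution rates for a conserved DNA-binding domain
print(selection_regime(0.02, 0.40))
```

A ratio well below 1, as in this made-up example, means amino acid changes are under-represented relative to silent changes, which is what "strong negative selection" refers to.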
The c-myb promoter contains multiple GGA repeats beginning 17 bp downstream of the transcription initiation site. GGA repeats have been previously shown to form unusual DNA structures in solution. Results from chemical footprinting, circular dichroism and RNA and DNA polymerase arrest assays on oligonucleotides representing the GGA repeat region of the c-myb promoter demonstrate that the element is able to form tetrad:heptad:heptad:tetrad (T:H:H:T) G-quadruplex structures by stacking two tetrad:heptad G-quadruplexes formed by two of the three (GGA)4 repeats. Deletion of one or two (GGA)4 motifs destabilizes this secondary structure and increases c-myb promoter activity, indicating that the G-quadruplexes formed in the c-myb GGA repeat region may act as a negative regulator of the c-myb promoter. Complete deletion of the c-myb GGA repeat region abolishes c-myb promoter activity, indicating dual roles of the c-myb GGA repeat element as both a transcriptional repressor and an activator. Furthermore, we demonstrated that Myc-associated zinc finger protein (MAZ) represses c-myb promoter activity and binds to the c-myb T:H:H:T G-quadruplexes. Our findings show that the T:H:H:T G-quadruplex-forming region in the c-myb promoter is a critical cis-acting element and may repress c-myb promoter activity through MAZ interaction with G-quadruplexes in the c-myb promoter.
Wood is mainly composed of secondary walls, which constitute the most abundant stored carbon produced by vascular plants. Understanding the molecular mechanisms controlling secondary wall deposition during wood formation is not only an important issue in plant biology but also critical for providing molecular tools to custom-design wood composition suited for diverse end uses. Past molecular and genetic studies have revealed a transcriptional network encompassing a group of wood-associated NAC and MYB transcription factors that are involved in the regulation of the secondary wall biosynthetic program during wood formation in poplar trees. Here, we report the functional characterization of poplar orthologs of MYB46 and MYB83 that are known to be master switches of secondary wall biosynthesis in Arabidopsis. In addition to the two previously-described PtrMYB3 and PtrMYB20, two other MYBs, PtrMYB2 and PtrMYB21, were shown to be MYB46/MYB83 orthologs by complementation and overexpression studies in Arabidopsis. The functional roles of these PtrMYBs in regulating secondary wall biosynthesis were further demonstrated in transgenic poplar plants showing an ectopic deposition of secondary walls in PtrMYB overexpressors and a reduction of secondary wall thickening in their dominant repressors. Furthermore, PtrMYB2/3/20/21 together with two other tree MYBs, the Eucalyptus EgMYB2 and the pine PtMYB4, were shown to differentially bind to and activate the eight variants of the 7-bp SMRE consensus sequence, composed of ACC(A/T)A(A/C)(T/C). Together, our results indicate that the tree MYBs, PtrMYB2/3/20/21, EgMYB2 and PtMYB4, are master transcriptional switches that activate the SMRE sites in the promoters of target genes and thereby regulate secondary wall biosynthesis during wood formation.
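The degenerate SMRE consensus ACC(A/T)A(A/C)(T/C) expands to the eight variants mentioned above, which a short sketch can enumerate and search for. The function names and the test sequence are hypothetical, and the scan covers only the strand given.

```python
import re
from itertools import product

# One entry per consensus position; multi-letter entries are alternatives
SMRE_POSITIONS = ["A", "C", "C", "AT", "A", "AC", "TC"]

def smre_variants():
    """Expand the degenerate consensus into all concrete 7-bp sequences."""
    return ["".join(combo) for combo in product(*SMRE_POSITIONS)]

# A lookahead is used so that overlapping matches are also reported
SMRE_REGEX = re.compile(r"(?=(ACC[AT]A[AC][TC]))")

def find_smre(sequence):
    """Return (offset, matched variant) for each SMRE on the given strand."""
    return [(m.start(), m.group(1)) for m in SMRE_REGEX.finditer(sequence.upper())]

print(len(smre_variants()))      # 8
print(find_smre("GGACCAACTTT"))  # [(2, 'ACCAACT')]
```

Three positions each allow two bases, giving 2 x 2 x 2 = 8 sequences, which matches the count stated in the text.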
A paradox of plant hormone biology is how a single small molecule can affect a diverse array of growth and developmental processes. For instance, brassinosteroids (BRs) regulate cell elongation, vascular differentiation, senescence and stress responses. BRs signal through the BES1/BZR1 (bri1-EMS-suppressor 1/Brassinazole-Resistant 1) family of transcription factors, which regulate hundreds of target genes involved in this pathway; yet little is known of this transcriptional network. By microarray and chromatin immunoprecipitation (ChIP) experiments, we identified a direct target gene of BES1, AtMYB30, which encodes a MYB family transcription factor. AtMYB30 null mutants display decreased BR responses and can enhance the dwarf phenotype of a weak allele of the BR receptor mutant bri1. Many BR-regulated genes have reduced expression and/or hormone-induction in AtMYB30 mutants, indicating that AtMYB30 functions to promote the expression of a subset of BR-target genes. AtMYB30 and BES1 bind to a conserved MYB-binding site and E-box sequences, respectively, in the promoters of genes that are regulated by both BRs and AtMYB30. Finally, AtMYB30 and BES1 interact with each other both in vitro and in vivo. These results demonstrated that BES1 and AtMYB30 function cooperatively to promote BR target gene expression. Our results therefore establish a new mechanism by which AtMYB30, a direct target of BES1, functions to amplify BR signaling by helping BES1 activate downstream target genes.
Brassinosteroids; BES1; MYB30; transcription; target genes
Despite the prominent roles played by R2R3-MYB transcription factors in the regulation of plant gene expression, little is known about the details of how these proteins interact with their DNA targets. For example, while Arabidopsis thaliana R2R3-MYB protein AtMYB61 is known to alter transcript abundance of a specific set of target genes, little is known about the specific DNA sequences to which AtMYB61 binds. To address this gap in knowledge, DNA sequences bound by AtMYB61 were identified using cyclic amplification and selection of targets (CASTing). The DNA targets identified using this approach corresponded to AC elements, sequences enriched in adenosine and cytosine nucleotides. The preferred target sequence that bound with the greatest affinity to AtMYB61 recombinant protein was ACCTAC, the AC-I element. Mutational analyses based on the AC-I element showed that ACC nucleotides in the AC-I element served as the core recognition motif, critical for AtMYB61 binding. Molecular modelling predicted interactions between AtMYB61 amino acid residues and corresponding nucleotides in the DNA targets. The affinity between AtMYB61 and specific target DNA sequences did not correlate with AtMYB61-driven transcriptional activation with each of the target sequences. CASTing-selected motifs were found in the regulatory regions of genes previously shown to be regulated by AtMYB61. Taken together, these findings are consistent with the hypothesis that AtMYB61 regulates transcription from specific cis-acting AC elements in vivo. The results shed light on the specifics of DNA binding by an important family of plant-specific transcriptional regulators.
The nuclear proto-oncogene c-myb is preferentially expressed in lymphohematopoietic cells, in which it plays an important role in the processes of differentiation and proliferation. The mechanism(s) that regulates c-myb expression is not fully understood, although in mouse cells a regulatory mechanism involves a transcriptional block in the first intron. To analyze the contribution of the 5' flanking sequences in regulating the expression of the human c-myb gene, we isolated a genomic clone containing extensive 5' flanking sequences, the first exon, and a large portion of the first intron. Sequence analysis of a subcloned 1.3-kb BamHI insert corresponding to 687 nucleotides of the 5' flanking sequence, the entire first exon, and 300 nucleotides of the first intron revealed the presence of closely spaced putative Myb binding sites within a segment extending from nucleotides -616 to -575 upstream from the cap site. A 165-bp segment containing these putative Myb binding sites was linked to a human thymidine kinase (TK) cDNA driven by a low-activity proliferating cell nuclear antigen promoter and cotransfected into TK- ts13 cells with a plasmid in which a full-length human c-myb cDNA is driven by the early simian virus 40 promoter; Myb inducibility of TK mRNA expression was observed both in transient expression assays and in stable transformants. The highest level of inducibility was detected when the 165-bp fragment was placed 138 bp upstream of the proliferating cell nuclear antigen promoter-TK cDNA reporter unit or 3' of the TK cDNA. Mutation of the putative Myb binding sites greatly reduced c-myb transactivation of TK mRNA expression and specifically reduced the binding of in vitro-translated Myb protein at those sites. Finally, c-myb transactivated TK mRNA expression driven by a segment of the authentic c-myb 5' flanking region containing the Myb binding sites. 
These data suggest that human c-myb maintains high levels of Myb protein in cells that require this gene product for proliferation and/or differentiation by an autoregulatory mechanism involving Myb binding sites in the 5' flanking region.
miR828 in Arabidopsis triggers the cleavage of Trans-Acting SiRNA Gene 4 (TAS4) transcripts and production of small interfering RNAs (ta-siRNAs). One siRNA, TAS4-siRNA81(−), targets a set of MYB transcription factors including PAP1, PAP2, and MYB113 which regulate the anthocyanin biosynthesis pathway. Interestingly, miR828 also targets MYB113, suggesting a close relationship between these MYBs, miR828, and TAS4, but their evolutionary origins are unknown. We found that PAP1, PAP2, and TAS4 expression is induced specifically by exogenous treatment with sucrose and glucose in seedlings. The induction is attenuated in abscisic acid (ABA) pathway mutants, especially in abi3-1 and abi5-1 for PAP1 or PAP2, while no such effect is observed for TAS4. PAP1 is under regulation by TAS4, demonstrated by the accumulation of PAP1 transcripts and anthocyanin in ta-siRNA biogenesis pathway mutants. TAS4-siR81(−) expression is induced by physiological concentrations of Suc and Glc and in pap1-D, an activation-tagged line, indicating a feedback regulatory loop exists between PAP1 and TAS4. Bioinformatic analysis revealed MIR828 homologues in dicots and gymnosperms, but only in one basal monocot, whereas TAS4 is only found in dicots. Consistent with this observation, PAP1, PAP2, and MYB113 dicot paralogs show peptide and nucleotide footprints for the TAS4-siR81(−) binding site, providing evidence for purifying selection in contrast to monocots. Extended sequence similarities between MIR828, MYBs, and TAS4 support an inverted duplication model for the evolution of MIR828 from an ancestral gymnosperm MYB gene and subsequent formation of TAS4 by duplication of the miR828* arm. We obtained evidence by modified 5′-RACE for a MYB mRNA cleavage product guided by miR828 in Pinus resinosa. 
Taken together, our results suggest that regulation of anthocyanin biosynthesis by TAS4 and miR828 in higher plants is evolutionarily significant and consistent with the evolution of TAS4 since the dicot-monocot divergence.
PAP1; TAS4; miR828; Sugar response; Feedback regulation; TAS evolution
AtMYB44 is a member of the R2R3 MYB subgroup 22 transcription factors and regulates diverse cellular responses in Arabidopsis thaliana. We performed quadruple 9-mer-based protein binding microarray (PBM) analysis, which revealed that full-size AtMYB44 recognized and bound to the consensus sequence AACnG, where n represents A, G, C or T. The consensus sequence was confirmed by electrophoretic mobility shift assay (EMSA) with a truncated AtMYB44 protein containing the N-terminal side R2R3 domain. This result indicates that the R2R3 domain alone is sufficient to exhibit AtMYB44 binding specificity. The sequence AACnG is the type I binding site for MYB transcription factors, including all members of the subgroup 22. EMSA showed that the R2R3 domain protein binds in vitro to promoters of randomly selected Arabidopsis genes that contain the consensus binding sequence. This implies that AtMYB44 binds to any promoter region that contains the consensus sequence, without determining their functional activity or specificity. The C-terminal side transcriptional activation domain of AtMYB44 contains an asparagine-rich fragment, NINNTTSSRHNHNN (aa 215–228), which, among the members of subgroup 22, is unique to AtMYB44. A transcriptional activation assay in yeast showed that this fragment is included in a region (aa 200–240) critical for the ability of AtMYB44 to function as a transcriptional activator. We hypothesize that the C-terminal side of the protein, but not the N-terminal side of the R2R3 domain, contributes to the functional activity and specificity of AtMYB44 through interactions with other regulators generated by each of a variety of stimuli.
Arabidopsis; AtMYB44; protein binding microarray; protein domain; transcription factor
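To illustrate what scanning a promoter for the AACnG consensus involves, the sketch below searches both DNA strands with regular expressions. It is only an assumption-laden illustration: the function names and the example sequence are hypothetical, and, as the abstract stresses, a sequence match says nothing about functional activity.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_aacng(seq):
    """Return sorted (position, strand, site) tuples for AACnG hits.

    Positions are 0-based on the forward strand. A hit on the reverse
    strand appears on the forward strand as CnGTT.
    """
    seq = seq.upper()
    hits = []
    # Lookahead groups allow overlapping matches to be reported.
    for m in re.finditer(r"(?=(AAC[ACGT]G))", seq):
        hits.append((m.start(), "+", m.group(1)))
    for m in re.finditer(r"(?=(C[ACGT]GTT))", seq):
        hits.append((m.start(), "-", reverse_complement(m.group(1))))
    return sorted(hits)


# Hypothetical promoter fragment, made up for illustration only.
example_hits = find_aacng("TTAACGGATCCGTTAA")
print(example_hits)
```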
The Myb family of transcription factors is defined by homology within the DNA binding domain and includes c-Myb, A-Myb, and B-Myb. The protein products of the myb genes all bind the Myb-binding site (MBS) [YG(A/G)C(A/C/G)GTT(G/A)]. A-myb has been found to display a limited pattern of expression. Here we report that bovine aortic smooth muscle cells (SMCs) express A-myb. Sequence analysis of isolated bovine A-myb cDNA clones spanning the entire coding region indicated extensive homology with the human gene, including the putative transactivation domain. Expression of A-myb was cell cycle dependent; levels of A-myb RNA increased in the late G1-to-S phase transition following serum stimulation of serum-deprived quiescent SMC cultures and peaked in S phase. Nuclear run-on analysis revealed that an increased rate of transcription can account for most of the increase in A-myb RNA levels. Treatment of SMC cultures with 5,6-dichlorobenzimidazole riboside, a selective inhibitor of RNA polymerase II, indicated a half-life of approximately 4 h for A-myb mRNA during the S phase of the cell cycle. Expression of A-myb by SMCs was stimulated by basic fibroblast growth factor, in a cell density-dependent fashion. Cotransfection of a human A-myb expression vector activated a multimerized MBS element-driven reporter construct approximately 30-fold in SMCs. The c-myb and c-myc promoters, which both contain multiple MBS elements, were similarly transactivated, approximately 30- and 50-fold, respectively, upon cotransfection with human A-myb. Lastly, A-myb RNA levels could be increased by a combination of phorbol ester plus insulin-like growth factor 1. To test the role of myb family members in progression through the cell cycle, we co-microinjected c-myc and myb expression vectors into serum-deprived quiescent SMCs.
The combination of c-myc and either A-myb or c-myb but not B-myb synergistically led to entry into S phase, whereas microinjection of any vector alone had little effect on S phase entry. Thus, these results suggest that A-myb is a potent transactivator in bovine SMCs and that its expression induces progression into S phase of the cell cycle.
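For readers unfamiliar with half-life estimates, the roughly 4-h A-myb mRNA half-life reported above can be turned into an expected fraction of transcript remaining after transcription is blocked. The sketch below assumes simple first-order decay, which the abstract does not state explicitly, and the function names are illustrative.

```python
import math

HALF_LIFE_H = 4.0  # approximate A-myb mRNA half-life reported for S phase


def decay_constant(half_life=HALF_LIFE_H):
    """First-order decay constant (per hour) for a given half-life."""
    return math.log(2) / half_life


def fraction_remaining(hours, half_life=HALF_LIFE_H):
    """Fraction of the initial mRNA left after `hours`, assuming first-order decay."""
    return 0.5 ** (hours / half_life)


for t in (0, 4, 8, 12):
    print(t, fraction_remaining(t))
```

Under this assumption, half of the transcript remains after 4 h and a quarter after 8 h.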
Transcription factor (TF) binding sites (cis elements) play a central role in gene regulation, and eukaryotic organisms frequently adopt combinatorial regulation to generate sophisticated local gene expression patterns. Knowing the precise cis element on a distal promoter is a prerequisite for studying a typical transcription process; however, the identification of cis elements has lagged behind that of their associated trans-acting TFs because of technical difficulties. Consequently, gene regulation by combinatorial TFs, although widely observed across biological processes, has remained vague in many cases.
We present here a strategy for identifying cis elements in combinatorial TF regulation. It consists of bioinformatic searches of available databases to generate candidate cis elements, followed by tests of the candidates using improved experimental assays. Taking the MYB and bHLH factors that collaboratively regulate the anthocyanin pathway genes as examples, we demonstrate how candidate cis motifs for the TFs are found on multi-specific promoters of chalcone synthase (CHS) genes, and how to test the candidate sites experimentally by designing DNA fragments that host the candidate motifs based on a known promoter (the us1 allele of Ipomoea purpurea CHS-D in our case) and applying site-directed mutagenesis at the motifs. TF-DNA interactions could be analyzed unambiguously by electrophoretic mobility shift assays (EMSA) and dual-luciferase transient expression assays, and the resulting evidence precisely delineated a cis element. The cis element for R2R3 MYBs, including Ipomoea MYB1 and Magnolia MYB1, was found to be ANCNACC, and that for bHLHs (exemplified by Ipomoea bHLH2 and petunia AN1) was CACNNG. A re-analysis of previously reported promoter segments recognized by maize C1 and apple MYB10 indicated that cis elements similar to ANCNACC were indeed present on these segments and that they bound Ipomoea MYB1.
Identification of cis elements in combinatorial regulation is now feasible with the strategy outlined. The working pipeline integrates the existing databases with experimental techniques, providing an open framework for precisely identifying cis elements. This strategy is widely applicable to various biological systems, and may enhance future analyses on gene regulation.
cis element; MYB; bHLH; EMSA; Dual-luciferase transient expression assay
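The candidate-generation and site-mutagenesis logic described above can be mimicked in silico. The following minimal sketch converts the reported degenerate motifs (ANCNACC for R2R3 MYBs, CACNNG for bHLHs) into regular expressions, locates them in a fragment, and checks that replacing a site removes the match. The helper names and the test fragment are hypothetical and are not the us1 promoter sequence; the mutation step stands in for, and does not reproduce, the experimental design.

```python
import re

# Only the symbols needed for the two motifs in the abstract.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}


def motif_to_regex(motif):
    """Compile a degenerate motif into an overlapping-match regex."""
    body = "".join(IUPAC[base] for base in motif)
    return re.compile("(?=(" + body + "))")


def find_sites(seq, motif):
    """Return (0-based start, matched sequence) for each motif occurrence."""
    return [(m.start(), m.group(1)) for m in motif_to_regex(motif).finditer(seq.upper())]


def mutate_site(seq, start, replacement):
    """Overwrite the bases beginning at `start` with `replacement`."""
    return seq[:start] + replacement + seq[start + len(replacement):]


fragment = "GGAACTACCTTCACGTGAA"  # made-up fragment for illustration
myb_sites = find_sites(fragment, "ANCNACC")
bhlh_sites = find_sites(fragment, "CACNNG")
mutant = mutate_site(fragment, myb_sites[0][0], "TTTTTTT")
print(myb_sites, bhlh_sites)
print(find_sites(mutant, "ANCNACC"), find_sites(mutant, "CACNNG"))
```

In this toy fragment the mutation abolishes the MYB-type match while leaving the bHLH-type match intact, which is the pattern one would want a mutant probe to show before testing it in a binding assay.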
The c-myb proto-oncogene is the founding member of a family of transcription factors involved principally in haematopoiesis, in diverse organisms, from zebrafish to mammals. Its deregulation has been implicated in human leukaemogenesis and other cancers. The expression of c-myb is tightly regulated by post-transcriptional mechanisms involving microRNAs. MicroRNAs are small, highly conserved non-coding RNAs that inhibit translation and decrease mRNA stability by binding to regulatory motifs mostly located in the 3'UTR of target mRNAs conserved throughout evolution. MYB is an evolutionarily conserved miR-150 target experimentally validated in mice, humans and zebrafish. However, the functional miR-150 sites of humans and mice are orthologous, whereas that of zebrafish is different.
We identified the avian mature miRNA-150-5P, Gallus gallus gga-miR-150 from chicken leukocyte small-RNA libraries and showed that, as expected, the gga-miR-150 sequence was highly conserved, including the seed region sequence present in the other miR-150 sequences listed in miRBase. Reporter assays showed that gga-miR-150 acted on the avian MYB 3'UTR and identified the avian MYB target site involved in gga-miR-150 binding. A comparative in silico analysis of the miR-150 target sites of MYB 3'UTRs from different species led to the identification of a single set of putative target sites in amphibians and zebrafish, whereas two sets of putative target sites were identified in chicken and mammals. However, only the target site present in the chicken MYB 3'UTR that was identical to that in zebrafish was functional, despite the additional presence of mammalian target sites in chicken. This specific miR-150 site usage was not cell-type specific and persisted when the chicken c-myb 3'UTR was used in the cell system to identify mammalian target sites, showing that this miR-150 target site usage was intrinsic to the chicken c-myb 3'UTR.
Our study of the avian MYB/gga-miR-150 interaction shows a conservation of miR-150 target site functionality between chicken and zebrafish that does not extend to mammals.
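An in silico search for putative miRNA target sites of the kind described above usually starts from the seed region. The sketch below assumes the common convention that the seed spans miRNA nucleotides 2 to 8 and that a candidate site is its reverse complement in the 3'UTR; the study's own search criteria are not given in the abstract. The miRNA and UTR sequences are invented placeholders, not gga-miR-150 or the MYB 3'UTR.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def seed_site(mirna, start=2, end=8):
    """Return the DNA-alphabet site complementary to miRNA nucleotides start..end (1-based)."""
    seed = mirna.upper().replace("U", "T")[start - 1:end]
    return seed.translate(COMPLEMENT)[::-1]


def find_seed_matches(utr, mirna):
    """Return 0-based start positions of seed-complementary sites in a 3'UTR."""
    site = seed_site(mirna)
    utr = utr.upper().replace("U", "T")
    return [i for i in range(len(utr) - len(site) + 1) if utr[i:i + len(site)] == site]


# Placeholder sequences, invented for illustration only.
toy_mirna = "UGACCUAGGAUCCAUGGCAAUC"
toy_utr = "AAACTAGGTCAAAGGGCTAGGTCTT"
print(seed_site(toy_mirna))
print(find_seed_matches(toy_utr, toy_mirna))
```

A seed match only nominates a site; as the results above show, functionality has to be established with reporter assays.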
Chromatin is a dynamic but highly regulated structure. DNA-binding proteins such as transcription factors and epigenetic and chromatin modifiers regulate specific gene expression patterns and may give rise to different phenotypes. Chromatin immunoprecipitation (ChIP) is the most widely used technique for identifying the proteins associated with a specific region of DNA. ChIP followed by next-generation sequencing (ChIP-seq) or microarray hybridization (ChIP-chip) is often used to study protein-binding profiles in different cell types and in cancer samples on a genome-wide scale. However, only a limited number of bioinformatics tools are available for analyzing ChIP datasets.
We present ChIPseek, a web-based tool for ChIP data analysis that provides summary statistics in graphs and offers several commonly demanded analyses. ChIPseek provides a statistical summary of the dataset, including a histogram of the peak length distribution, a histogram of distances to the nearest transcription start site (TSS), and a pie chart (or bar chart) of genomic locations, giving users a comprehensive view of the dataset before further analysis. For examining the potential functions of peaks, ChIPseek provides peak annotation, visualization of peak genomic location, motif identification, sequence extraction, and comparison between datasets. ChIPseek also lets users filter peaks and re-analyze the filtered subset. ChIPseek supports 20 different genome assemblies for 12 model organisms: human, mouse, rat, worm, fly, frog, zebrafish, chicken, yeast, fission yeast, Arabidopsis, and rice. We use demo datasets to demonstrate the usage and intuitive user interface of ChIPseek.
ChIPseek provides a user-friendly interface for biologists to analyze large-scale ChIP data without requiring any programming skills. All the results and figures produced by ChIPseek can be downloaded for further analysis. The analysis tools built into ChIPseek, especially those for selecting and examining a subset of peaks from ChIP data, provide invaluable help for exploring high-throughput data from either ChIP-seq or ChIP-chip. ChIPseek is freely available at http://chipseek.cgu.edu.tw.
ChIP-seq; ChIP-chip; Analysis tool; Web-services; Peak annotation; Motif identification; Filter tools; Comparison
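Two of the summary statistics listed above, peak length and distance to the nearest TSS, are simple to compute from peak coordinates. The sketch below is a generic illustration and not ChIPseek's implementation: the function names, the use of the peak midpoint, and the toy coordinates are all assumptions.

```python
from bisect import bisect_left


def peak_lengths(peaks):
    """Length of each (chrom, start, end) peak."""
    return [end - start for _, start, end in peaks]


def distance_to_nearest_tss(peak, tss_by_chrom):
    """Absolute distance from the peak midpoint to the closest TSS.

    `tss_by_chrom` maps a chromosome name to a sorted list of TSS positions.
    Returns None when the chromosome has no annotated TSS.
    """
    chrom, start, end = peak
    center = (start + end) // 2
    sites = tss_by_chrom.get(chrom, [])
    if not sites:
        return None
    i = bisect_left(sites, center)
    # Only the TSS just before and just after the midpoint can be closest.
    candidates = sites[max(i - 1, 0):i + 1]
    return min(abs(center - s) for s in candidates)


# Toy coordinates, invented for illustration.
toy_peaks = [("chr1", 100, 300), ("chr1", 950, 1050), ("chr2", 10, 20)]
toy_tss = {"chr1": [500, 1000]}
print(peak_lengths(toy_peaks))
print([distance_to_nearest_tss(p, toy_tss) for p in toy_peaks])
```

The resulting lists are the raw values that a histogram of peak lengths or TSS distances would be drawn from.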
The c-Myb transcription factor is an important regulator of hematopoietic cell development. c-Myb is expressed in immature hematopoietic cells and plays a direct role in lineage fate selection, cell cycle progression, and differentiation of myeloid as well as B- and T-lymphoid progenitor cells. As a DNA-binding transcription factor, c-Myb regulates specific gene programs through activation of target genes. Still, our understanding of these programs is incomplete. Here, we report a set of novel c-Myb target genes, identified using a combined approach: specific c-Myb knockdown by 2 different siRNAs and subsequent global expression profiling, combined with the confirmation of direct binding of c-Myb to the target promoters by ChIP assays. The combination of these 2 approaches, as well as additional validation such as cloning and testing the promoters in reporter assays, confirmed that MYADM, LMO2, GATA2, STAT5A, and IKZF1 are target genes of c-Myb. Additional studies, using chromosome conformation capture, demonstrated that c-Myb target genes may directly interact with each other, indicating that these genes may be coordinately regulated. Of the 5 novel target genes identified, 3 are transcription factors, and one is a transcriptional co-regulator, supporting a role of c-Myb as a master regulator controlling the expression of other transcriptional regulators in the hematopoietic system.
c-Myb; hematopoiesis; transcription factors; MYADM; LMO2; GATA2; STAT5A; IKZF1; 3C; chromosome conformation capture
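The combined approach described above amounts to intersecting independent lines of evidence: genes whose expression changes with each of the two siRNAs, and genes whose promoters are bound in ChIP assays. A minimal sketch of that filtering step is shown below; the function name is illustrative and the gene labels are placeholders, not data from the study.

```python
def candidate_targets(sirna1_changed, sirna2_changed, chip_bound):
    """Genes responsive to both siRNAs and bound in ChIP, sorted by name."""
    return sorted(set(sirna1_changed) & set(sirna2_changed) & set(chip_bound))


# Placeholder gene labels, invented for illustration.
sirna1 = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
sirna2 = ["GENE_B", "GENE_C", "GENE_D", "GENE_E"]
chip = ["GENE_C", "GENE_D", "GENE_F"]
print(candidate_targets(sirna1, sirna2, chip))
```

Requiring agreement between two different siRNAs reduces the chance that an off-target effect of either one is mistaken for a c-Myb-dependent change.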