The MYB gene family comprises one of the richest groups of transcription factors in plants. Plant MYB proteins are characterized by a highly conserved MYB DNA-binding domain. MYB proteins are classified into four major groups, namely 1R-MYB, 2R-MYB, 3R-MYB and 4R-MYB, based on the number and position of MYB repeats. MYB transcription factors are involved in plant development, secondary metabolism, hormone signal transduction, disease resistance and abiotic stress tolerance. A comparative analysis of MYB family genes in rice and Arabidopsis will help reveal the evolution and function of MYB genes in plants.
A genome-wide analysis identified at least 155 and 197 MYB genes in rice and Arabidopsis, respectively. Gene structure analysis revealed that MYB family genes possess relatively more introns in the middle than in the C- and N-terminal regions of the predicted genes. Intronless MYB genes are highly conserved in both rice and Arabidopsis. MYB genes encoding R2R3 repeat MYB proteins retained a conserved gene structure with three exons and two introns, whereas genes encoding R1R2R3 repeat-containing proteins consist of six exons and five introns. The splicing pattern is similar among R1R2R3 MYB genes in Arabidopsis. In contrast, variation in splicing pattern was observed among R1R2R3 MYB members of rice. Consensus motif analysis of the 1 kb upstream region (5′ to the translation initiation codon) of MYB gene ORFs led to the identification of conserved and over-represented cis-motifs in both rice and Arabidopsis. Real-time quantitative RT-PCR analysis showed that several MYB genes are up-regulated by various abiotic stresses in both rice and Arabidopsis.
A comprehensive genome-wide analysis of chromosomal distribution, tandem repeats and phylogenetic relationship of MYB family genes in rice and Arabidopsis suggested their evolution via duplication. Genome-wide comparative analysis of MYB genes and their expression analysis identified several MYBs with potential role in development and stress response of plants.
The transcription regulatory properties of murine B-myb protein were compared to those of c-myb. Whereas c-Myb trans-activated an SV40 early promoter containing multiple copies of an upstream c-Myb DNA-binding site (MBS-1), and similarly the human c-myc promoter, B-Myb was unable to do so. Full-length B-Myb translated in vitro did not bind MBS-1; however, truncation of the B-Myb C-terminus or fusion of the B-Myb DNA-binding domain to the c-Myb C-terminus showed that it was inherently competent to interact with this motif. Further evidence from co-transfection experiments, demonstrating that B-Myb inhibited trans-activation by c-Myb, suggested that failure of B-Myb to trans-activate these promoters did not simply occur through lack of binding to MBS-1. Moreover, using GAL4/B-Myb fusions, it was found that an acidic region of B-Myb, which by comparison to c-Myb was expected to contain a transcription activation domain, actually had no inherent trans-activation activity and indeed appeared to trans-inhibit c-Myb. In contrast to the above findings, both B-Myb and c-Myb were able to weakly trans-activate the DNA polymerase alpha promoter. Results obtained here demonstrate that the activities of B-Myb and c-Myb are clearly distinct and suggest that these related proteins may have different functions in regulation of target gene expression.
The RAG-2 gene encodes a component of the V(D)J recombinase which is essential for the assembly of antigen receptor genes in B and T lymphocytes. Previously, we reported that the transcription factor BSAP (PAX-5) regulates the murine RAG-2 promoter in B-cell lines. A partially overlapping but distinct region of the proximal RAG-2 promoter was also identified as an important element for promoter activity in T cells; however, the responsible factor was unknown. In this report, we present data demonstrating that c-Myb binds to a Myb consensus site within the proximal promoter and is critical for its activity in T-lineage cells. We show that c-Myb can transactivate a RAG-2 promoter-reporter construct in cotransfection assays and that this transactivation depends on the proximal promoter Myb consensus site. By using a chromatin immunoprecipitation (ChIP) strategy, fractionation of chromatin with anti-c-Myb antibody specifically enriched endogenous RAG-2 promoter DNA sequences. DNase I genomic footprinting revealed that the c-Myb site is occupied in a tissue-specific fashion in vivo. Furthermore, an integrated RAG-2 promoter construct with mutations at the c-Myb site was not enriched in the ChIP assay, while a wild-type integrated promoter construct was enriched. Finally, this lack of binding of c-Myb to a chromosomally integrated mutant RAG-2 promoter construct in vivo was associated with a striking decrease in promoter activity. We conclude that c-Myb regulates the RAG-2 promoter in T cells by binding to this consensus c-Myb binding site.
Finding where transcription factors (TFs) bind to the DNA is of key importance to decipher gene regulation at a transcriptional level. Classically, computational prediction of TF binding sites (TFBSs) is based on basic position weight matrices (PWMs) which quantitatively score binding motifs based on the observed nucleotide patterns in a set of TFBSs for the corresponding TF. Such models make the strong assumption that each nucleotide participates independently in the corresponding DNA-protein interaction and do not account for flexible length motifs. We introduce transcription factor flexible models (TFFMs) to represent TF binding properties. Based on hidden Markov models, TFFMs are flexible, and can model both position interdependence within TFBSs and variable length motifs within a single dedicated framework. The availability of thousands of experimentally validated DNA-TF interaction sequences from ChIP-seq allows for the generation of models that perform as well as PWMs for stereotypical TFs and can improve performance for TFs with flexible binding characteristics. We present a new graphical representation of the motifs that convey properties of position interdependence. TFFMs have been assessed on ChIP-seq data sets coming from the ENCODE project, revealing that they can perform better than both PWMs and the dinucleotide weight matrix extension in discriminating ChIP-seq from background sequences. Under the assumption that ChIP-seq signal values are correlated with the affinity of the TF-DNA binding, we find that TFFM scores correlate with ChIP-seq peak signals. Moreover, using available TF-DNA affinity measurements for the Max TF, we demonstrate that TFFMs constructed from ChIP-seq data correlate with published experimentally measured DNA-binding affinities. Finally, TFFMs allow for the straightforward computation of an integrated TF occupancy score across a sequence. 
These results demonstrate the capacity of TFFMs to accurately model DNA-protein interactions, while providing a single unified framework suitable for the next generation of TFBS prediction.
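The baseline that TFFMs are compared against, a position weight matrix that scores each position independently, can be illustrated with a minimal sketch. This is not the TFFM method or any published implementation; the function names, the pseudocount, the uniform background and the toy binding sites are all assumptions made for illustration.

```python
import math

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM from aligned, equal-length binding sites.

    Assumes a uniform background (0.25 per base) and an additive
    pseudocount; both are illustrative choices, not values from the text.
    """
    pwm = []
    for pos in range(len(sites[0])):
        counts = {base: pseudocount for base in "ACGT"}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2((counts[b] / total) / background) for b in "ACGT"})
    return pwm

def score(pwm, window):
    """Sum per-position log-odds: each base contributes independently."""
    return sum(column[base] for column, base in zip(pwm, window))

def best_hit(pwm, sequence):
    """Return (score, start) of the best-scoring window on the given strand."""
    width = len(pwm)
    return max((score(pwm, sequence[i:i + width]), i)
               for i in range(len(sequence) - width + 1))

# Hypothetical aligned sites, invented for this example only.
toy_sites = ["CACGTG", "CACGTG", "CACGTG", "CATGTG"]
toy_pwm = build_pwm(toy_sites)
print(best_hit(toy_pwm, "TTTCACGTGAAA"))
```

Because the score is a plain sum over columns, this model cannot express dependencies between positions or variable spacing, which is the limitation the hidden Markov model formulation is described as addressing.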
Transcription factors are critical proteins for sequence-specific control of transcriptional regulation. Finding where these proteins bind to DNA is of key importance for global efforts to decipher the complex mechanisms of gene regulation. Greater understanding of the regulation of transcription promises to improve human genetic analysis by specifying critical gene components that have eluded investigators. Classically, computational prediction of transcription factor binding sites (TFBS) is based on models giving weights to each nucleotide at each position. We introduce a novel statistical model for the prediction of TFBS that tolerates a broader range of TFBS configurations than can be conveniently accommodated by existing methods. The new models are designed to address the confounding properties of nucleotide composition, inter-positional sequence dependence and variable lengths (e.g. variable spacing between half-sites) observed in the more comprehensive experimental data now emerging. The new models generate scores consistent with DNA-protein affinities measured experimentally and can be represented graphically, retaining desirable attributes of past methods. These results demonstrate the capacity of the new approach to accurately assess DNA-protein interactions. With the rich experimental data generated from chromatin immunoprecipitation experiments, a greater diversity of TFBS properties has emerged that can now be accommodated within a single predictive approach.
Iron-inducible transcription of the ap65-1 gene in Trichomonas vaginalis involves at least three Myb-like transcriptional factors (tvMyb1, tvMyb2 and tvMyb3) that differentially bind to two closely spaced promoter sites, MRE-1/MRE-2r and MRE-2f. Here, we defined a fragment of tvMyb2 comprising residues 40–156, denoted tvMyb2(40–156), as the minimum structural unit that retains near full binding affinity with the promoter DNAs. Like c-Myb in vertebrates, the DNA-free tvMyb2(40–156) has a flexible and open conformation. Upon binding to the promoter DNA elements, tvMyb2(40–156) undergoes significant conformational re-arrangement and structure stabilization. Crystal structures of tvMyb2(40–156) in complex with promoter element-containing DNA oligomers showed that 5′-a/gACGAT-3′ is the specific base sequence recognized by tvMyb2(40–156), which does not fully conform to that of the Myb binding site sequence. Furthermore, Lys49, which is upstream of the R2 motif (amino acids 52–102), also participates in specific DNA sequence recognition. Intriguingly, tvMyb2(40–156) binds to the promoter elements in an orientation opposite to that proposed in the HADDOCK model of the tvMyb1(35–141)/MRE-1-MRE-2r complex. These results shed new light on understanding the molecular mechanism of Myb–DNA recognition and provide a framework to study the molecular basis of transcriptional regulation of myriad Mybs in T. vaginalis.
The v-myb oncogene and its cellular homolog c-myb encode sequence-specific DNA-binding proteins which regulate transcription from promoters containing Myb-binding sites in animal cells. We have developed a Saccharomyces cerevisiae system to assay transcriptional activation by v-Myb and c-Myb. In yeast strains containing integrated reporter genes, activation was strictly dependent upon both the Myb DNA-binding domain and the Myb recognition element. BAS1, an endogenous Myb-related yeast protein, was not required for transactivation by animal Myb proteins and by itself had no detectable effect on a Myb reporter gene. Deletion analyses demonstrated that a domain of v-Myb C terminal to the previously mapped Myb transcriptional activation domain was required for transactivation in animal cells but not in S. cerevisiae. The same domain is also required for the efficient transformation of myeloid cells by v-Myb. In contrast to results in animal cells, in S. cerevisiae the full-length c-Myb was a much stronger transactivator than a protein bearing the oncogenic N- and C-terminal truncations of v-Myb. These results imply that negative regulation of c-Myb by its own termini requires an additional animal cell protein or small molecule that is not present in S. cerevisiae.
Myb genes from Arabidopsis and rice were clustered into subgroups. The distribution of introns in the phylogenetic tree suggests that introns were inserted during evolution.
Myb proteins contain a conserved DNA-binding domain composed of one to four repeat motifs (referred to as R0R1R2R3); each repeat is approximately 50 amino acids in length, with regularly spaced tryptophan residues. Although the Myb proteins comprise one of the largest families of transcription factors in plants, little is known about the functions of most Myb genes. Here we use computational techniques to classify Myb genes on the basis of sequence similarity and gene structure, and to identify possible functional relationships among subgroups of Myb genes from Arabidopsis and rice (Oryza sativa L. ssp. indica).
This study analyzed 130 Myb genes from Arabidopsis and 85 from rice. The collected Myb proteins were clustered into subgroups based on sequence similarity and phylogeny. Interestingly, the exon-intron structure differed between subgroups, but was conserved in the same subgroup. Moreover, the Myb domains contained a significant excess of phase 1 and 2 introns, as well as an excess of nonsymmetric exons. Conserved motifs were detected in carboxy-terminal coding regions of Myb genes within subgroups. In contrast, no common regulatory motifs were identified in the noncoding regions. Additionally, some Myb genes with similar functions were clustered in the same subgroups.
The distribution of introns in the phylogenetic tree suggests that Myb domains originally were compact in size; introns were inserted and the splicing sites conserved during evolution. Conserved motifs identified in the carboxy-terminal regions are specific for Myb genes, and the identified Myb gene subgroups may reflect functional conservation.
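The intron phase and exon symmetry terms used in these results follow a simple arithmetic convention, sketched below. Under the standard definition, a phase 0 intron falls between two codons, a phase 1 intron after the first base of a codon, and a phase 2 intron after the second; an internal exon is symmetric when its flanking introns share the same phase. The function names and exon lengths are hypothetical and are not taken from the study.

```python
def intron_phases(cds_exon_lengths):
    """Phase of each intron, given the coding length (in bp) of each exon.

    The phase of an intron is the number of coding bases preceding it,
    modulo 3. The last exon has no intron after it.
    """
    phases = []
    coding_so_far = 0
    for length in cds_exon_lengths[:-1]:
        coding_so_far += length
        phases.append(coding_so_far % 3)
    return phases

def exon_symmetry(cds_exon_lengths):
    """True for each internal exon whose flanking introns share a phase."""
    phases = intron_phases(cds_exon_lengths)
    return [phases[i] == phases[i + 1] for i in range(len(phases) - 1)]

# Hypothetical three-exon gene (coding lengths invented for illustration).
example = [133, 130, 400]
print(intron_phases(example))   # phases of intron 1 and intron 2
print(exon_symmetry(example))   # symmetry of the single internal exon
```

An internal exon is symmetric exactly when its coding length is a multiple of three, so an excess of nonsymmetric exons corresponds to many internal exons whose lengths are not divisible by three.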
Chromatin is a dynamic but highly regulated structure. DNA-binding proteins such as transcription factors and epigenetic and chromatin modifiers are responsible for regulating specific gene expression patterns and may give rise to different phenotypes. Chromatin immunoprecipitation (ChIP) is the most widely used technique for revealing the identity of the proteins associated with a specific region of DNA. ChIP followed by next-generation sequencing (ChIP-seq) or microarray hybridization (ChIP-chip) is often used to study protein-binding profiles in different cell types and in cancer samples on a genome-wide scale. However, only a limited number of bioinformatics tools are available for the analysis of ChIP datasets.
We present ChIPseek, a web-based tool for ChIP data analysis providing summary statistics in graphs and offering several commonly demanded analyses. ChIPseek can provide statistical summary of the dataset including histogram of peak length distribution, histogram of distances to the nearest transcription start site (TSS), and pie chart (or bar chart) of genomic locations for users to have a comprehensive view on the dataset for further analysis. For examining the potential functions of peaks, ChIPseek provides peak annotation, visualization of peak genomic location, motif identification, sequence extraction, and comparison between datasets. Beyond that, ChIPseek also offers users the flexibility to filter peaks and re-analyze the filtered subset of peaks. ChIPseek supports 20 different genome assemblies for 12 model organisms including human, mouse, rat, worm, fly, frog, zebrafish, chicken, yeast, fission yeast, Arabidopsis, and rice. We use demo datasets to demonstrate the usage and intuitive user interface of ChIPseek.
ChIPseek provides a user-friendly interface for biologists to analyze large-scale ChIP data without requiring any programming skills. All the results and figures produced by ChIPseek can be downloaded for further analysis. The analysis tools built into ChIPseek, especially those for selecting and examining a subset of peaks from ChIP data, provide invaluable help for exploring high-throughput data from either ChIP-seq or ChIP-chip. ChIPseek is freely available at http://chipseek.cgu.edu.tw.
ChIP-seq; ChIP-chip; Analysis tool; Web-services; Peak annotation; Motif identification; Filter tools; Comparison
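One of the summary statistics mentioned above, the distance from each peak to its nearest transcription start site, can be sketched as follows. This is not ChIPseek's code; the function name, the signed-distance convention (peak position minus TSS position), the neglect of strand and chromosome, and the coordinates are all assumptions for illustration.

```python
from bisect import bisect_left

def nearest_tss_distance(peak_center, sorted_tss):
    """Signed distance (peak minus TSS) to the closest TSS on one chromosome.

    sorted_tss must be sorted in ascending order. Strand is ignored here,
    which a real annotation tool would need to take into account.
    """
    i = bisect_left(sorted_tss, peak_center)
    # Only the TSS just before and just after the peak can be the nearest.
    candidates = sorted_tss[max(i - 1, 0):i + 1]
    return min((peak_center - tss for tss in candidates), key=abs)

# Hypothetical coordinates, invented for this example.
tss_positions = [100, 500, 900]
peak_centers = [50, 520, 1000]
print([nearest_tss_distance(p, tss_positions) for p in peak_centers])
```

The resulting list of distances is what would then be binned to draw a histogram of distances to the nearest TSS.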
Plant microRNAs (miRNAs) are critical regulators of gene expression; however, little attention has been given to the principles governing miRNA silencing efficacy. Here, we utilize the highly conserved Arabidopsis miR159-MYB33/MYB65 regulatory module to explore these principles. Firstly, we show that perfect central complementarity is not required for strong silencing. Artificial miR159 variants with two cleavage site mismatches can potently silence MYB33/MYB65, fully complementing a loss-of-function mir159 mutant. Moreover, these miR159 variants can cleave MYB33/MYB65 mRNA; however, cleavage appears attenuated, as the ratio of cleavage products to full-length transcripts decreases with increasing central mismatches. Nevertheless, high levels of un-cleaved MYB33/MYB65 transcripts are strongly silenced by a non-cleavage mechanism. In contrast to MIR159a variants that strongly silenced endogenous MYB33/MYB65, artificial MYB33 variants with central mismatches to miR159 are not efficiently silenced. We demonstrate that differences in the miRNA:target mRNA stoichiometry underlie this paradox. Increasing miR159 abundance in the MYB33 variants results in a strong silencing outcome, whereas increasing MYB33 transcript levels in the MIR159a variants results in a poor silencing outcome. Finally, we identify highly conserved nucleotides that flank the miR159 binding site in MYB33, and demonstrate that they are critical for efficient silencing, as mutation of these flanking nucleotides attenuates silencing to a level similar to that of central mismatches. This implies that the context in which the miRNA binding site resides is a key determinant in controlling the degree of silencing and that a miRNA “target site” encompasses sequences that extend beyond the miRNA binding site. In conclusion, our findings dismiss the notion that miRNA:target complementarity, underpinned by central matches, is the sole dictator of the silencing outcome.
In plants, microRNAs (miRNAs) are critical regulators of gene expression. As most validated targets are of high complementarity, whose transcripts are cleaved by the miRNA, both complementarity and cleavage are thought to be the major factors determining the degree to which a target gene is silenced. Here, we explore this principle utilizing the highly conserved miR159-MYB33/MYB65 regulatory module in the model flowering plant Arabidopsis. Firstly, we demonstrate that perfect central complementarity facilitates efficient transcript cleavage but is not required for a strong silencing outcome, as miR159 variants with two central mismatches can recognize and silence MYB33/MYB65 effectively in planta. Driving this silencing is a potent miR159-mediated non-cleavage mechanism that ensures total silencing even when MYB33 transcript levels are very high. Secondly, we demonstrate that the stoichiometric ratio of miRNA to target mRNA is a critical determinant of a silencing outcome, and that ratio becomes increasingly important for inefficient miRNA-target interactions. Finally, we show that nucleotides flanking the miR159 binding site of MYB33 are essential for efficient silencing, demonstrating that the sequence context in which the miRNA target site resides in has a major impact on the silencing outcome. Together, we have shown that although high complementarity underpinned by efficient transcript cleavage may be a prerequisite for a strong silencing outcome, many additional factors that modulate the strength of the miRNA-target interaction are at play. These findings will have ramifications for bioinformatics prediction of miRNA targets and design of artificial miRNAs.
Mammalian spermatogenesis involves formation of haploid cells from the male germline and then a complex morphological transformation to generate motile sperm. Focusing on meiotic prophase, some tissue-specific transcription factors are known (A-MYB) or suspected (RFX2) to play important roles in modulating gene expression in pachytene spermatocytes. The current work was initiated to identify both downstream and upstream regulatory connections for Rfx2.
Searches of pachytene up-regulated genes identified high affinity RFX binding sites (X boxes) in promoter regions of several new genes: Adam5, Pdcl2, and Spag6. We confirmed a strong promoter-region X-box for Alf, a germ cell-specific variant of general transcription factor TFIIA. Using Alf as an example of a target gene, we showed that its promoter is stimulated by RFX2 in transfected cells and used ChIP analysis to show that the promoter is occupied by RFX2 in vivo. Turning to upstream regulation of the Rfx2 promoter, we identified a cluster of three binding sites (MBS) for the MYB family of transcription factors. Because testis is one of the few sites of A-myb expression, and because spermatogenesis arrests in pachytene in A-myb knockout mice, the MBS cluster implicates Rfx2 as an A-myb target. Electrophoretic gel-shift, ChIP, and co-transfection assays all support a role for these MYB sites in Rfx2 expression. Further, Rfx2 expression was virtually eliminated in A-myb knockout testes. Immunohistology on testis sections showed that A-MYB expression is up-regulated only after pachytene spermatocytes have clearly moved away from the tubule wall, which correlates with onset of RFX2 expression, whereas B-MYB expression, by contrast, is prevalent only in earlier spermatocytes and spermatogonia.
With an expanding list of likely target genes, RFX2 is potentially an important transcriptional regulator in pachytene spermatocytes. Rfx2 itself is a good candidate to be regulated by A-MYB, which is essential for meiotic progression. If Alf is a genuine RFX2 target, then A-myb, Rfx2, and Alf may form part of a transcriptional network that is vital for completion of meiosis and preparation for post-meiotic differentiation.
The MYB superfamily constitutes one of the most abundant groups of transcription factors described in plants. Nevertheless, their functions appear to be highly diverse and remain rather unclear. To date, no genome-wide characterization of this gene family has been conducted in a legume species. Here we report the first genome-wide analysis of the whole MYB superfamily in a legume species, soybean (Glycine max), including the gene structures, phylogeny, chromosome locations, conserved motifs, and expression patterns, as well as a comparative genomic analysis with Arabidopsis.
A total of 244 R2R3-MYB genes were identified and further classified into 48 subfamilies based on a phylogenetic comparative analysis with their putative orthologs, which showed both gene loss and duplication events. The phylogenetic analysis showed that most characterized MYB genes with similar functions are clustered in the same subfamily; together with the identification of orthologs by synteny analysis, this strongly indicated functional conservation among subgroups of MYB genes. The phylogenetic relationships of each subgroup of MYB genes were well supported by the highly conserved intron/exon structures and motifs outside the MYB domain. Analysis of the ratio of nonsynonymous to synonymous nucleotide substitutions (dN/dS) showed that the soybean MYB DNA-binding domain is under strong negative selection. The chromosome distribution pattern strongly indicated that genome-wide segmental and tandem duplications contributed to the expansion of soybean MYB genes. In addition, we found that ~4% of soybean R2R3-MYB genes had undergone alternative splicing events, producing a variety of transcripts from a single gene, which illustrates the extremely high complexity of transcriptome regulation. Comparative expression profile analysis of R2R3-MYB genes in soybean and Arabidopsis revealed that MYB genes play both conserved and varied roles in plants, the latter being indicative of a divergence in function.
In this study we identified the largest MYB gene family in plants known to date. Our findings indicate that members of this large gene family may be involved in different plant biological processes, some of which may be involved in legume-specific nodulation. Our comparative genomics analysis provides a solid foundation for future functional dissection of this gene family.
The c-myb promoter contains multiple GGA repeats beginning 17 bp downstream of the transcription initiation site. GGA repeats have been previously shown to form unusual DNA structures in solution. Results from chemical footprinting, circular dichroism and RNA and DNA polymerase arrest assays on oligonucleotides representing the GGA repeat region of the c-myb promoter demonstrate that the element is able to form tetrad:heptad:heptad:tetrad (T:H:H:T) G-quadruplex structures by stacking two tetrad:heptad G-quadruplexes formed by two of the three (GGA)4 repeats. Deletion of one or two (GGA)4 motifs destabilizes this secondary structure and increases c-myb promoter activity, indicating that the G-quadruplexes formed in the c-myb GGA repeat region may act as a negative regulator of the c-myb promoter. Complete deletion of the c-myb GGA repeat region abolishes c-myb promoter activity, indicating dual roles of the c-myb GGA repeat element as both a transcriptional repressor and an activator. Furthermore, we demonstrated that Myc-associated zinc finger protein (MAZ) represses c-myb promoter activity and binds to the c-myb T:H:H:T G-quadruplexes. Our findings show that the T:H:H:T G-quadruplex-forming region in the c-myb promoter is a critical cis-acting element and may repress c-myb promoter activity through MAZ interaction with G-quadruplexes in the c-myb promoter.
Wood is mainly composed of secondary walls, which constitute the most abundant stored carbon produced by vascular plants. Understanding the molecular mechanisms controlling secondary wall deposition during wood formation is not only an important issue in plant biology but also critical for providing molecular tools to custom-design wood composition suited for diverse end uses. Past molecular and genetic studies have revealed a transcriptional network encompassing a group of wood-associated NAC and MYB transcription factors that are involved in the regulation of the secondary wall biosynthetic program during wood formation in poplar trees. Here, we report the functional characterization of poplar orthologs of MYB46 and MYB83 that are known to be master switches of secondary wall biosynthesis in Arabidopsis. In addition to the two previously-described PtrMYB3 and PtrMYB20, two other MYBs, PtrMYB2 and PtrMYB21, were shown to be MYB46/MYB83 orthologs by complementation and overexpression studies in Arabidopsis. The functional roles of these PtrMYBs in regulating secondary wall biosynthesis were further demonstrated in transgenic poplar plants showing an ectopic deposition of secondary walls in PtrMYB overexpressors and a reduction of secondary wall thickening in their dominant repressors. Furthermore, PtrMYB2/3/20/21 together with two other tree MYBs, the Eucalyptus EgMYB2 and the pine PtMYB4, were shown to differentially bind to and activate the eight variants of the 7-bp SMRE consensus sequence, composed of ACC(A/T)A(A/C)(T/C). Together, our results indicate that the tree MYBs, PtrMYB2/3/20/21, EgMYB2 and PtMYB4, are master transcriptional switches that activate the SMRE sites in the promoters of target genes and thereby regulate secondary wall biosynthesis during wood formation.
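The eight variants of the 7-bp SMRE consensus ACC(A/T)A(A/C)(T/C) follow directly from its three two-fold degenerate positions (2 × 2 × 2 = 8). A minimal sketch that enumerates them and scans a sequence is given below; the function names and the test sequence are hypothetical, and only the forward strand is scanned, so a real promoter analysis would also need the reverse complement.

```python
import re
from itertools import product

# Allowed bases at each of the seven SMRE positions: ACC(A/T)A(A/C)(T/C).
SMRE_POSITIONS = ["A", "C", "C", "AT", "A", "AC", "TC"]

def smre_variants():
    """Enumerate every sequence matching the degenerate SMRE consensus."""
    return ["".join(combo) for combo in product(*SMRE_POSITIONS)]

# A lookahead lets overlapping sites be reported as well.
SMRE_REGEX = re.compile(r"(?=(ACC[AT]A[AC][TC]))")

def find_smre(sequence):
    """Return (start, site) for each forward-strand SMRE match."""
    return [(m.start(), m.group(1)) for m in SMRE_REGEX.finditer(sequence.upper())]

print(smre_variants())
# Invented sequence fragment, not a real promoter.
print(find_smre("ggACCTACCtt"))
```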
Transcription factors (TFs) and their binding sites (TFBSs) play a central role in the regulation of gene expression. It is therefore vital to know how the allocation pattern of TFBSs affects the functioning of any particular gene in vivo. A widely used method to analyze TFBSs in vivo is chromatin immunoprecipitation (ChIP). However, this method in its present state does not enable the individual investigation of densely arranged TFBSs because of the underlying nonspecific DNA fragmentation technique. This study describes a site-specific ChIP that combines the benefits of both EMSA and in vivo footprinting in a single assay, thereby allowing the individual detection and analysis of single binding motifs.
The standard ChIP protocol was modified by replacing the conventional DNA fragmentation step, i.e. sonication or undirected enzymatic digestion (by MNase), with a sequence-specific enzymatic digestion step. This alteration enables the specific immunoprecipitation and individual examination of occupied sites, even in a complex system of adjacent binding motifs in vivo. Immunoprecipitated chromatin was analyzed by PCR using two primer sets: one for the specific detection of precipitated TFBSs and one for validating the completeness of the enzyme digestion step. The method was established using Sp1 TFBSs within the egfr promoter region as an example. Using this site-specific ChIP, we were able to confirm that four previously described Sp1 binding sites within the egfr promoter region are occupied by Sp1 in vivo. Despite the dense arrangement of the Sp1 TFBSs, the improved ChIP method was able to individually examine the allocation of all adjacent Sp1 TFBSs at once. The broad applicability of this site-specific ChIP was demonstrated by analyzing these Sp1 motifs in both osteosarcoma cells and kidney carcinoma tissue.
The ChIP technology is a powerful tool for investigating transcription factors in vivo, especially in cancer biology. The established site-specific enzyme digestion enables a reliable and individual detection option for densely arranged binding motifs in vivo not provided by e.g. EMSA or in vivo footprinting. Given the important function of transcription factors in neoplastic mechanism, our method enables a broad diversity of application options for clinical studies.
A paradox of plant hormone biology is how a single small molecule can affect a diverse array of growth and developmental processes. For instance, brassinosteroids (BRs) regulate cell elongation, vascular differentiation, senescence and stress responses. BRs signal through the BES1/BZR1 (bri1-EMS-suppressor 1/Brassinazole-Resistant 1) family of transcription factors, which regulate hundreds of target genes involved in this pathway; yet little is known of this transcriptional network. By microarray and chromatin immunoprecipitation (ChIP) experiments, we identified a direct target gene of BES1, AtMYB30, which encodes a MYB family transcription factor. AtMYB30 null mutants display decreased BR responses and can enhance the dwarf phenotype of a weak allele of the BR receptor mutant bri1. Many BR-regulated genes have reduced expression and/or hormone-induction in AtMYB30 mutants, indicating that AtMYB30 functions to promote the expression of a subset of BR-target genes. AtMYB30 and BES1 bind to a conserved MYB-binding site and E-box sequences, respectively, in the promoters of genes that are regulated by both BRs and AtMYB30. Finally, AtMYB30 and BES1 interact with each other both in vitro and in vivo. These results demonstrated that BES1 and AtMYB30 function cooperatively to promote BR target gene expression. Our results therefore establish a new mechanism by which AtMYB30, a direct target of BES1, functions to amplify BR signaling by helping BES1 activate downstream target genes.
Brassinosteroids; BES1; MYB30; transcription; target genes
Redundancy and competition between R2R3-MYB activators and repressors on common target genes has been proposed as a fine-tuning mechanism for the regulation of plant secondary metabolism. This hypothesis was tested in white spruce [Picea glauca (Moench) Voss] by investigating the effects of R2R3-MYBs from different subgroups on common targets from distinct metabolic pathways. Comparative analysis of transcript profiling data in spruces overexpressing R2R3-MYBs from loblolly pine (Pinus taeda L.), PtMYB1, PtMYB8, and PtMYB14, defined a set of common genes that display opposite regulation effects. The relationship between the closest MYB homologues and 33 putative target genes was explored by quantitative PCR expression profiling in wild-type P. glauca plants during the diurnal cycle. Significant Spearman’s correlation estimates were consistent with the proposed opposite effect of different R2R3-MYBs on several putative target genes in a time-related and tissue-preferential manner. Expression of sequences coding for 4CL, DHS2, COMT1, SHM4, and a lipase thio/esterase positively correlated with that of PgMYB1 and PgMYB8, but negatively with that of PgMYB14 and PgMYB15. Complementary electrophoretic mobility shift assay (EMSA) and transactivation assay provided experimental evidence that these different R2R3-MYBs are able to bind similar AC cis-elements in the promoter region of Pg4CL and PgDHS2 genes but have opposite effects on their expression. Competitive binding EMSA experiments showed that PgMYB8 competes more strongly than PgMYB15 for the AC-I MYB binding site in the Pg4CL promoter. Together, the results bring a new perspective to the action of R2R3-MYB proteins in the regulation of distinct but interconnecting metabolism pathways.
Conifers; phenylpropanoid pathway; protein–DNA binding; R2R3-MYB evolution; transcriptional network.
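The correlation step described in the abstract above can be illustrated with a minimal sketch of Spearman's rank correlation. The expression values and variable names below are hypothetical and are not data from the study; the sketch only shows how a target gene can correlate positively with one MYB and negatively with another across time points.

```python
from statistics import mean

def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of values tied with the value at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman(x, y):
    """Spearman's rho: the Pearson correlation computed on the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical relative expression levels at six diurnal time points
target = [0.9, 1.5, 3.2, 3.9, 2.0, 1.3]
activator = [1.0, 1.8, 2.9, 3.5, 2.2, 1.1]
repressor = [3.3, 2.4, 1.2, 0.8, 2.0, 3.1]

print(spearman(target, activator), spearman(target, repressor))
```

A significance test (as implied by "significant" in the abstract) is omitted here; a statistics package would normally supply the p-value.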
The nuclear proto-oncogene c-myb is preferentially expressed in lymphohematopoietic cells, in which it plays an important role in the processes of differentiation and proliferation. The mechanism(s) that regulates c-myb expression is not fully understood, although in mouse cells a regulatory mechanism involves a transcriptional block in the first intron. To analyze the contribution of the 5' flanking sequences in regulating the expression of the human c-myb gene, we isolated a genomic clone containing extensive 5' flanking sequences, the first exon, and a large portion of the first intron. Sequence analysis of a subcloned 1.3-kb BamHI insert corresponding to 687 nucleotides of the 5' flanking sequence, the entire first exon, and 300 nucleotides of the first intron revealed the presence of closely spaced putative Myb binding sites within a segment extending from nucleotides -616 to -575 upstream from the cap site. A 165-bp segment containing these putative Myb binding sites was linked to a human thymidine kinase (TK) cDNA driven by a low-activity proliferating cell nuclear antigen promoter and cotransfected into TK- ts13 cells with a plasmid in which a full-length human c-myb cDNA is driven by the early simian virus 40 promoter; Myb inducibility of TK mRNA expression was observed both in transient expression assays and in stable transformants. The highest level of inducibility was detected when the 165-bp fragment was placed 138 bp upstream of the proliferating cell nuclear antigen promoter-TK cDNA reporter unit or 3' of the TK cDNA. Mutation of the putative Myb binding sites greatly reduced c-myb transactivation of TK mRNA expression and specifically reduced the binding of in vitro-translated Myb protein at those sites. Finally, c-myb transactivated TK mRNA expression driven by a segment of the authentic c-myb 5' flanking region containing the Myb binding sites. 
These data suggest that human c-myb maintains high levels of Myb protein in cells that require this gene product for proliferation and/or differentiation by an autoregulatory mechanism involving Myb binding sites in the 5' flanking region.
AtMYB44 is a member of the R2R3 MYB subgroup 22 transcription factors and regulates diverse cellular responses in Arabidopsis thaliana. We performed quadruple 9-mer-based protein binding microarray (PBM) analysis, which revealed that full-size AtMYB44 recognized and bound to the consensus sequence AACnG, where n represents A, G, C or T. The consensus sequence was confirmed by electrophoretic mobility shift assay (EMSA) with a truncated AtMYB44 protein containing the N-terminal side R2R3 domain. This result indicates that the R2R3 domain alone is sufficient to exhibit AtMYB44 binding specificity. The sequence AACnG is the type I binding site for MYB transcription factors, including all members of the subgroup 22. EMSA showed that the R2R3 domain protein binds in vitro to promoters of randomly selected Arabidopsis genes that contain the consensus binding sequence. This implies that AtMYB44 binds to any promoter region that contains the consensus sequence, without determining their functional activity or specificity. The C-terminal side transcriptional activation domain of AtMYB44 contains an asparagine-rich fragment, NINNTTSSRHNHNN (aa 215–228), which, among the members of subgroup 22, is unique to AtMYB44. A transcriptional activation assay in yeast showed that this fragment is included in a region (aa 200–240) critical for the ability of AtMYB44 to function as a transcriptional activator. We hypothesize that the C-terminal side of the protein, but not the N-terminal side of the R2R3 domain, contributes to the functional activity and specificity of AtMYB44 through interactions with other regulators generated by each of a variety of stimuli.
Arabidopsis; AtMYB44; protein binding microarray; protein domain; transcription factor
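As a rough illustration of what searching a promoter for the AACnG consensus involves, the following sketch scans both strands of a DNA string for an IUPAC-coded motif. The function names and the example sequence are hypothetical, and a match only indicates a candidate site; as the abstract notes, in vitro binding does not by itself establish functional activity.

```python
import re

# IUPAC nucleotide codes translated to regular-expression character classes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def find_sites(seq, consensus="AACNG"):
    """Return (forward-strand start, strand, matched sequence) for each hit."""
    seq = seq.upper()
    core = "".join(IUPAC[base] for base in consensus.upper())
    pattern = re.compile(f"(?=({core}))")  # lookahead keeps overlapping hits
    hits = [(m.start(), "+", m.group(1)) for m in pattern.finditer(seq)]
    for m in pattern.finditer(reverse_complement(seq)):
        # convert the reverse-strand coordinate back to the forward strand
        start = len(seq) - m.start() - len(consensus)
        hits.append((start, "-", m.group(1)))
    return sorted(hits)

# Hypothetical promoter fragment
for hit in find_sites("GAACTGTTTTCCGTTA"):
    print(hit)
```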
Transcriptional enhancers integrate the contributions of multiple classes of transcription factors (TFs) to orchestrate the myriad spatio-temporal gene expression programs that occur during development. A molecular understanding of enhancers with similar activities requires the identification of both their unique and their shared sequence features. To address this problem, we combined phylogenetic profiling with a DNA–based enhancer sequence classifier that analyzes the TF binding sites (TFBSs) governing the transcription of a co-expressed gene set. We first assembled a small number of enhancers that are active in Drosophila melanogaster muscle founder cells (FCs) and other mesodermal cell types. Using phylogenetic profiling, we increased the number of enhancers by incorporating orthologous but divergent sequences from other Drosophila species. Functional assays revealed that the diverged enhancer orthologs were active in largely similar patterns as their D. melanogaster counterparts, although there was extensive evolutionary shuffling of known TFBSs. We then built and trained a classifier using this enhancer set and identified additional related enhancers based on the presence or absence of known and putative TFBSs. Predicted FC enhancers were over-represented in proximity to known FC genes; and many of the TFBSs learned by the classifier were found to be critical for enhancer activity, including POU homeodomain, Myb, Ets, Forkhead, and T-box motifs. Empirical testing also revealed that the T-box TF encoded by org-1 is a previously uncharacterized regulator of muscle cell identity. Finally, we found extensive diversity in the composition of TFBSs within known FC enhancers, suggesting that motif combinatorics plays an essential role in the cellular specificity exhibited by such enhancers. 
In summary, machine learning combined with evolutionary sequence analysis is useful for recognizing novel TFBSs and for facilitating the identification of cognate TFs that coordinate cell type–specific developmental gene expression patterns.
The development of multicellular organisms requires the formation of a diversity of cell types. Each cell has a unique genetic program that is orchestrated by regulatory sequences called enhancers, comprising multiple short DNA sequences that bind distinct transcription factors. Understanding developmental regulatory networks requires knowledge of the sequence features of functionally related enhancers. We developed an integrated evolutionary and computational approach for deciphering enhancer regulatory codes and applied this method to discover new components of the transcriptional network controlling muscle development in the fruit fly, Drosophila melanogaster. Our method involves assembling known muscle enhancers, expanding this set with evolutionarily conserved sequences, computationally classifying these enhancers based on their shared sequence features, and scanning the entire Drosophila genome to predict additional related enhancers. Using this approach, we created a map of 5,500 putative muscle enhancers, identified candidate transcription factors to which they bind, observed a strong correlation between mapped enhancers and muscle gene expression, and uncovered extensive heterogeneity among combinations of transcription factor binding sites in validated muscle enhancers, a feature that may contribute to the individual cellular specificities of these regulatory elements. Our strategy can readily be generalized to study transcriptional networks in other organisms and developmental contexts.
Summary: ChIP-based technology is becoming the leading approach for globally profiling thousands of transcription factors and elucidating transcriptional regulation mechanisms in living cells. It has evolved rapidly in recent years, from hybridization with spotted or tiling microarrays (ChIP-chip), to paired-end tag sequencing (ChIP-PET), to current massively parallel sequencing (ChIP-seq). Although many tools are available for identifying binding sites (peaks) in ChIP-chip and ChIP-seq data, few are offered as easily accessible online web tools that process both data types for the ChIP-based user community. We have therefore developed a comprehensive web application for processing ChIP-chip and ChIP-seq data. Our web tool, W-ChIPeaks, employs a probe-based (or bin-based) enrichment threshold to define peaks and applies statistical methods to control the false discovery rate for identified peaks. The web tool includes two web interfaces, PELT for ChIP-chip and BELT for ChIP-seq, both of which were tested on previously published experimental data. The novel features of our tool include comprehensive output for identified peaks in GFF, BED, bedGraph and .wig formats; annotation of the genes to which these peaks are related; and graphical interpretation and visualization of the results via a user-friendly web interface.
Supplementary information: Supplementary data are available at Bioinformatics online.
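The general idea of bin-based enrichment with false discovery rate control can be sketched as follows. This is not the W-ChIPeaks algorithm, whose details are not given in the summary above; it assumes a uniform Poisson background and uses the Benjamini-Hochberg procedure, and all function names, the bin size and the cutoff are hypothetical choices for illustration.

```python
import math
from collections import Counter

def poisson_sf(k, lam):
    """P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)          # P(X = 0)
    cdf = term
    for i in range(1, k):          # accumulate P(X = 1) .. P(X = k - 1)
        term *= lam / i
        cdf += term
    return max(0.0, 1.0 - cdf)

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    for rank in range(n, 0, -1):   # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

def call_enriched_bins(read_starts, genome_length, bin_size=100, fdr=0.05):
    """Return (bin_start, read_count, q_value) for bins passing the cutoff."""
    n_bins = math.ceil(genome_length / bin_size)
    counts = Counter(pos // bin_size for pos in read_starts)
    lam = len(read_starts) / n_bins   # assumed uniform background rate
    pvals = [poisson_sf(counts.get(b, 0), lam) for b in range(n_bins)]
    qvals = benjamini_hochberg(pvals)
    return [(b * bin_size, counts.get(b, 0), qvals[b])
            for b in range(n_bins) if qvals[b] <= fdr]

# Hypothetical read start positions: one pile-up in the bin starting at 300
reads = [300 + i for i in range(20)] + [b * 100 + 50 for b in range(10) if b != 3]
print(call_enriched_bins(reads, genome_length=1000))
```

Real peak callers typically estimate the background from a control sample and merge adjacent enriched bins into peaks; both steps are left out of this sketch.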
The Myb family of transcription factors is defined by homology within the DNA binding domain and includes c-Myb, A-Myb, and B-Myb. The protein products of the myb genes all bind the Myb-binding site (MBS) [YG(A/G)C(A/C/G)GTT(G/A)]. A-myb has been found to display a limited pattern of expression. Here we report that bovine aortic smooth muscle cells (SMCs) express A-myb. Sequence analysis of isolated bovine A-myb cDNA clones spanning the entire coding region indicated extensive homology with the human gene, including the putative transactivation domain. Expression of A-myb was cell cycle dependent; levels of A-myb RNA increased in the late G1-to-S phase transition following serum stimulation of serum-deprived quiescent SMC cultures and peaked in S phase. Nuclear run-on analysis revealed that an increased rate of transcription can account for most of the increase in A-myb RNA levels. Treatment of SMC cultures with 5,6-dichlorobenzimidazole riboside, a selective inhibitor of RNA polymerase II, indicated a half-life of approximately 4 h for A-myb mRNA during the S phase of the cell cycle. Expression of A-myb by SMCs was stimulated by basic fibroblast growth factor in a cell density-dependent fashion. Cotransfection of a human A-myb expression vector activated a multimerized MBS element-driven reporter construct approximately 30-fold in SMCs. The c-myb and c-myc promoters, which both contain multiple MBS elements, were similarly transactivated, approximately 30- and 50-fold, respectively, upon cotransfection with human A-myb. Lastly, A-myb RNA levels could be increased by a combination of phorbol ester plus insulin-like growth factor 1. To test the role of myb family members in progression through the cell cycle, we comicroinjected c-myc and myb expression vectors into serum-deprived quiescent SMCs.
The combination of c-myc and either A-myb or c-myb but not B-myb synergistically led to entry into S phase, whereas microinjection of any vector alone had little effect on S phase entry. Thus, these results suggest that A-myb is a potent transactivator in bovine SMCs and that its expression induces progression into S phase of the cell cycle.
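To make the half-life figure above concrete, the following sketch shows how a half-life relates to transcript levels measured after transcription is blocked. It assumes simple first-order decay, which the abstract does not state explicitly; the function names and the example measurements are hypothetical.

```python
import math

def fraction_remaining(hours, half_life_hours=4.0):
    """Fraction of transcript left after `hours` of first-order decay."""
    return 0.5 ** (hours / half_life_hours)

def decay_constant(half_life_hours=4.0):
    """First-order rate constant k (per hour), from t_half = ln(2) / k."""
    return math.log(2) / half_life_hours

def estimate_half_life(t1, level1, t2, level2):
    """Half-life implied by two measurements taken after the block."""
    k = math.log(level1 / level2) / (t2 - t1)
    return math.log(2) / k

# Hypothetical measurements: 100 units at 0 h and 25 units at 8 h
print(estimate_half_life(0, 100.0, 8, 25.0))
print(fraction_remaining(2))
```

With a 4-h half-life, one quarter of the initial mRNA would remain 8 h after transcription stops, which is the relationship the second helper inverts.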
Transcription factor (TF) binding sites (cis elements) play a central role in gene regulation, and eukaryotic organisms frequently adopt combinatorial regulation to produce sophisticated local gene expression patterns. Knowing the precise cis element on a distal promoter is a prerequisite for studying a typical transcription process; however, identification of cis elements has lagged behind that of their associated trans-acting TFs due to technical difficulties. Consequently, gene regulation by combinatorial TFs, although widely observed across biological processes, has remained vague in many cases.
We present here a valid strategy for identifying cis elements in combinatorial TF regulation. It consists of bioinformatic searches of available databases to generate candidate cis elements and tests of the candidates using improved experimental assays. Taking the MYB and the bHLH that collaboratively regulate the anthocyanin pathway genes as examples, we demonstrate how candidate cis motifs for the TFs are found on multi-specific promoters of chalcone synthase (CHS) genes, and how to experimentally test the candidate sites by designing DNA fragments hosting the candidate motifs based on a known promoter (us1 allele of Ipomoea purpurea CHS-D in our case) and applying site-mutagenesis at the motifs. TF-DNA interactions could be unambiguously analyzed by electrophoretic mobility shift assays (EMSA) and dual-luciferase transient expression assays, and the resulting evidence precisely delineated a cis element. The cis element for R2R3 MYBs including Ipomoea MYB1 and Magnolia MYB1, for instance, was found to be ANCNACC, and that for bHLHs (exemplified by Ipomoea bHLH2 and petunia AN1) was CACNNG. A re-analysis of previously reported promoter segments recognized by maize C1 and apple MYB10 indicated that cis elements similar to ANCNACC were indeed present on these segments, and these elements tested positive for binding to Ipomoea MYB1.
Identification of cis elements in combinatorial regulation is now feasible with the strategy outlined. The working pipeline integrates the existing databases with experimental techniques, providing an open framework for precisely identifying cis elements. This strategy is widely applicable to various biological systems, and may enhance future analyses on gene regulation.
cis element; MYB; bHLH; EMSA; Dual-luciferase transient expression assay
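The bioinformatic half of the strategy above, generating candidate sites for two cooperating TFs, might be sketched as a search for ANCNACC and CACNNG motifs lying close together. The distance cutoff, the function names and the example sequence are assumptions made for illustration and do not come from the study; only the forward strand is scanned to keep the sketch short.

```python
import re

# Lookaheads allow overlapping matches; N is written out as [ACGT]
MYB_SITE = re.compile(r"(?=(A[ACGT]C[ACGT]ACC))")   # ANCNACC
BHLH_SITE = re.compile(r"(?=(CAC[ACGT]{2}G))")      # CACNNG

def site_starts(pattern, seq):
    """Return the 0-based start of every match on the forward strand."""
    return [m.start() for m in pattern.finditer(seq.upper())]

def paired_sites(seq, max_gap=50):
    """Return (myb_start, bhlh_start) pairs whose starts lie within max_gap bp."""
    myb = site_starts(MYB_SITE, seq)
    bhlh = site_starts(BHLH_SITE, seq)
    return [(m, b) for m in myb for b in bhlh if abs(m - b) <= max_gap]

# Hypothetical promoter fragment with one candidate site of each type
fragment = "GGG" + "AACTACC" + "TTTTT" + "CACGTG" + "GGG"
print(paired_sites(fragment))
```

Candidate pairs found this way would still need the experimental tests the abstract describes (mutagenesis, EMSA, transient expression) before being treated as functional cis elements.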
The Arabidopsis AtMYB80 transcription factor regulates genes involved in pollen development and controls the timing of tapetal programmed cell death (PCD). Downregulation of AtMYB80 expression precedes tapetal degradation. Inhibition of AtMYB80 expression results in complete male sterility. Full-length AtMYB80 homologs have been isolated in wheat, rice, barley and canola (C genome).
The complete sequences of MYB80 genes from Brassica napus (A gene), B. juncea (A gene), B. oleracea (C gene) and the two orthologs from cotton (Gossypium hirsutum) were determined. The deduced amino acid sequences possess a highly conserved MYB domain, a 44-amino acid region and an 18-amino acid C-terminal sequence. The cotton MYB80 protein can fully restore fertility of the atmyb80 mutant, while removal of the 44-amino acid sequence abolishes its function. Two conserved MYB cis-elements in the AtMYB80 promoter are required for downregulation of MYB80 expression in anthers, apparently via negative auto-regulation. In cotton, tapetal degradation occurs at a slightly earlier stage of anther development than in Arabidopsis, consistent with an earlier increase and subsequent downregulation in GhMYB80 expression. The MYB80 homologs fused with the EAR repressor motif have been shown to induce male sterility in Arabidopsis. Constructs were designed to maximize the level of male sterility.
MYB80 genes are conserved in structure and function in all monocot and dicot species so far examined. Expression patterns of MYB80 in these species are also highly similar. The reversible male sterility system developed in Arabidopsis by manipulating MYB80 expression should be applicable to all major crops.
Electronic supplementary material
The online version of this article (doi:10.1186/s12870-014-0278-3) contains supplementary material, which is available to authorized users.
Brassica; Cotton; Gossypium hirsutum; Male sterility; MYB80; Transcription factor
c-Myb is expressed at high levels in immature progenitors of all the hematopoietic lineages. It is associated with the regulation of proliferation, differentiation and survival of erythroid, myeloid and lymphoid cells, but decreases during the terminal differentiation to mature blood cells. The cellular level of c-Myb is controlled by not only transcriptional regulation but also ubiquitin-dependent proteolysis. We recently reported that mouse c-Myb protein is controlled by ubiquitin-dependent degradation by SCF-Fbw7 E3 ligase via glycogen synthase kinase 3 (GSK3)-mediated phosphorylation of Thr-572 in a Cdc4 phosphodegron (CPD)-dependent manner. However, this critical threonine residue is not conserved in human c-Myb. In this study, we investigated whether GSK3 is involved in the regulatory mechanism for human c-Myb expression.
Human c-Myb was degraded by ubiquitin-dependent degradation via SCF-Fbw7. Human Fbw7 ubiquitylated not only human c-Myb but also mouse c-Myb, whereas mouse Fbw7 ubiquitylated mouse c-Myb but not human c-Myb. Human Fbw7 mutants with mutations of arginine residues important for recognition of the CPD still ubiquitylated human c-Myb. These data strongly suggest that human Fbw7 ubiquitylates human c-Myb in a CPD-independent manner. Mutations of the putative GSK3 phosphorylation sites in human c-Myb did not affect the Fbw7-dependent ubiquitylation of human c-Myb. Neither chemical inhibitors nor a siRNA for GSK3β affected the stability of human c-Myb. However, depletion of GSK3β upregulated the transcription of human c-Myb, resulting in transcriptional suppression of γ-globin, one of the c-Myb target genes.
The present observations suggest that human Fbw7 ubiquitylates human c-Myb in a CPD-independent manner, whereas mouse Fbw7 ubiquitylates mouse c-Myb in a CPD-dependent manner. Moreover, GSK3 negatively regulates the transcriptional expression of human c-Myb but does not promote Fbw7-dependent degradation of human c-Myb protein. Inactivation of GSK3 as well as mutations of Fbw7 may be causes of the enhanced c-Myb expression observed in leukemia cells. We conclude that expression levels of human and mouse c-Myb are regulated via different mechanisms.
The retroviral oncogene v-myb encodes a transcription factor (v-Myb) which transforms myelomonocytic cells in vivo and in vitro. It is thought that v-Myb exerts its biological effects by deregulating the expression of specific target genes, most of which are still unknown. The chicken glioma-amplified sequence 41 gene (GAS41) is located immediately downstream of the lysozyme gene, a known Myb-regulated gene. The GAS41 promoter colocalizes with a CpG island which also functions as an origin of replication. Since the GAS41 promoter contains several potential Myb-binding sites (MBSs) we have investigated whether GAS41 is a v-Myb target gene. Our results show that the GAS41 gene is directly activated by a v-Myb/estrogen receptor fusion protein. Furthermore, our studies reveal that the GAS41 promoter is stimulated by v-Myb in co-transfection experiments and that the DNA-binding activity of v-Myb is crucial for transactivation of the promoter. Electrophoretic mobility-shift assays (EMSA) indicate that several Myb-binding sites, residing ∼250 bp upstream of the transcriptional start site, are bound by Myb in vitro. Furthermore, chromatin immunoprecipitation assays demonstrate that v-Myb is bound to the GAS41 promoter in vivo. Taken together these findings identify the GAS41 gene as a novel v-Myb target gene. We have also analysed the GAS41 replication origin in myelomonocytic cells and have failed to observe significant differences in origin activity in cells expressing or not expressing v-Myb.