Transcription factors (TFs) form complexes that bind regulatory modules (RMs) within DNA, to control specific sets of genes. Some transcription factor binding sites (TFBSs) near the transcription start site (TSS) display tight positional preferences relative to the TSS. Furthermore, near the TSS, RMs can co-localize TFBSs with each other and the TSS. The proportion of TFBS positional preferences due to TFBS co-localization within RMs is unknown, however. ChIP experiments confirm co-localization of some TFBSs genome-wide, including near the TSS, but they typically examine only a few TFs at a time, using non-physiological conditions that can vary from lab to lab. In contrast, sequence analysis can examine many TFs uniformly and methodically, broadly surveying the co-localization of TFBSs with tight positional preferences relative to the TSS.
Our statistics found 43 significant sets of human motifs in the JASPAR TF Database with positional preferences relative to the TSS, with 38 preferences tight (±5 bp). Each set of motifs corresponded to a gene group of 135 to 3304 genes, with 42/43 (98%) gene groups independently validated by DAVID, a gene ontology database, with FDR < 0.05. Motifs corresponding to two TFBSs in a RM should co-occur more than by chance alone, enriching the intersection of the gene groups corresponding to the two TFs. Thus, a gene-group intersection systematically enriched beyond chance alone provides evidence that the two TFs participate in an RM. Of the 903 = 43*42/2 intersections of the 43 significant gene groups, we found 768/903 (85%) pairs of gene groups with significantly enriched intersections, with 564/768 (73%) intersections independently validated by DAVID with FDR < 0.05. A user-friendly web site at http://go.usa.gov/3kjsH permits biologists to explore the interaction network of our TFBSs to identify candidate subunit RMs.
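The intersection-enrichment reasoning above can be sketched as a one-sided test on the overlap of two gene groups. The abstract does not name the statistic used, so the hypergeometric test below is an assumption, as are the function names, the universe size of 20,000 genes, and the overlap of 50 in the usage example.

```python
from math import comb


def n_pairs(k: int) -> int:
    """Number of unordered pairs among k gene groups (43 groups give 903 pairs)."""
    return k * (k - 1) // 2


def intersection_pvalue(total: int, size_a: int, size_b: int, overlap: int) -> float:
    """Upper-tail hypergeometric probability P(X >= overlap).

    X is the overlap between a fixed group of size_a genes and a group of
    size_b genes drawn at random, without replacement, from `total` genes.
    Assumption: this is an illustrative test, not necessarily the authors' statistic.
    """
    largest = min(size_a, size_b)
    tail = sum(
        comb(size_a, x) * comb(total - size_a, size_b - x)
        for x in range(overlap, largest + 1)
    )
    return tail / comb(total, size_b)


if __name__ == "__main__":
    print(n_pairs(43))
    # Hypothetical numbers: group sizes taken from the reported range (135 and 3304),
    # universe size and observed overlap invented for illustration only.
    print(intersection_pvalue(total=20000, size_a=135, size_b=3304, overlap=50))
```

With 903 pairwise tests, the resulting p-values would still need a multiple-testing correction before any intersection is called significant.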
Gene duplication and convergent evolution within a genome provide obvious biological mechanisms for replicating an RM near the TSS that binds a particular TF subunit. Of all intersections of our 43 significant gene groups, 85% were significantly enriched, with 73% of the significant enrichments independently validated by gene ontology. The co-localization of TFBSs within RMs therefore likely explains much of the tight TFBS positional preferences near the TSS.
Electronic supplementary material
The online version of this article (doi:10.1186/s12859-016-1354-5) contains supplementary material, which is available to authorized users.
Transcription factor binding site; Positional preference; Transcription start site
The Danish Urogynaecological Database is established in order to ensure high quality of treatment for patients undergoing urogynecological surgery. The database contains details of all women in Denmark undergoing incontinence surgery or pelvic organ prolapse surgery amounting to ~5,200 procedures per year. The variables are collected along the course of treatment of the patient from the referral to a postoperative control. Main variables are prior obstetrical and gynecological history, symptoms, symptom-related quality of life, objective urogynecological findings, type of operation, complications if relevant, implants used if relevant, 3–6-month postoperative recording of symptoms, if any. A set of clinical quality indicators is being maintained by the steering committee for the database and is published in an annual report which also contains extensive descriptive statistics. The database has a completeness of over 90% of all urogynecological surgeries performed in Denmark. Some of the main variables have been validated using medical records as gold standard. The positive predictive value was above 90%. The data are used as a quality monitoring tool by the hospitals and in a number of scientific studies of specific urogynecological topics, broader epidemiological topics, and the use of patient reported outcome measures.
urogynecology; pelvic organ prolapse surgery; incontinence surgery; surgical quality monitoring
The number of new technologies for risk assessment available in health care is increasing. These technologies are intended to contribute to both improved care practices and improved patient outcomes. To do so, however, it is necessary to study how new technologies are understood and interpreted by users in clinical practice. The objective of this study was to explore patient and physician perspectives on the usefulness of a new technology to detect Cardiovascular Autonomic Neuropathy (CAN) in a specialist diabetes clinic. The technology is a handheld device that measures resting heart rate and conducts three cardiac autonomic reflex tests to evaluate heart rate variability.
The study relied on three sources of data: observations of medical consultations where results of the CAN test were reported (n = 8); interviews with patients who had received the CAN test (n = 19); and interviews with physicians who reported results of the CAN test (n = 9). Data were collected at the specialist diabetes clinic between November 2013 and January 2014. Data were analysed using the concept of technological frames which is used to assess how physicians and patients understand and interpret the new technology.
Physicians generally found it difficult to communicate test results to patients in terms that patients could understand and to translate results into meaningful implications for the treatment of patients. Results of the study indicate that patients recalled neither having done the CAN test nor receiving the results. Furthermore, patients were generally unsure about the purpose of the CAN test and the implications of the results.
Involving patients and physicians is essential when a new technology is introduced in clinical practice. This particularly includes the interpretation and communication processes related to its use.
The integration of a new risk assessment technology into clinical practice can be accompanied by several challenges. It is suggested that more information about the CAN test be provided to patients and that a dialogue-based approach be used when communicating test results to patients in order to best support the use of the technology in clinical practice.
Users’ experiences; Technology; Risk assessment; Diabetes; Cardiac autonomic neuropathy
Hepatocellular carcinoma (HCC) is a lethal malignancy with high mortality and poor prognosis. Oncogenic transcription factor Late SV40 Factor (LSF) plays an important role in promoting HCC. A small molecule inhibitor of LSF, Factor Quinolinone Inhibitor 1 (FQI1), significantly inhibited human HCC xenografts in nude mice without harming normal cells. Here we evaluated the efficacy of FQI1 and another inhibitor, FQI2, in inhibiting endogenous hepatocarcinogenesis. HCC was induced in a transgenic mouse with hepatocyte-specific overexpression of c-myc (Alb/c-myc) by injecting N-nitrosodiethylamine (DEN) followed by FQI1 or FQI2 treatment after tumor development. LSF inhibitors markedly decreased tumor burden in Alb/c-myc mice with a corresponding decrease in proliferation and angiogenesis. Interestingly, in vitro treatment of human HCC cells with LSF inhibitors resulted in mitotic arrest with an accompanying increase in CyclinB1. Inhibition of CyclinB1 induction by Cycloheximide or CDK1 activity by Roscovitine significantly prevented FQI-induced mitotic arrest. A significant induction of apoptosis was also observed upon treatment with FQI. These effects of LSF inhibition, mitotic arrest and induction of apoptosis by FQIs, provide multiple avenues by which these inhibitors eliminate HCC cells. LSF inhibitors might be highly potent and effective therapeutics for HCC either alone or in combination with currently existing therapies.
LSF; HCC; FQI; mitotic arrest; apoptosis
To explore the feasibility of a research-based program for patient-centered consultations to improve medical adherence and blood glucose control in patients with type 2 diabetes.
Patients and methods
The patient-centered empowerment, motivation, and medical adherence (EMMA) consultation program consisted of three individual consultations and one phone call with a single health care professional (HCP). Nineteen patients with type 2 diabetes completed the feasibility study. Feasibility was assessed by a questionnaire-based interview with patients 2 months after the final consultation and interviews with HCPs. Patient participation was measured by 10-second event coding based on digital recordings and observations of the consultations.
HCPs reported that EMMA supported patient-centered consultations by facilitating dialogue, reflection, and patient activity. Patients reported that they experienced valuable learning during the consultations, felt understood and listened to, and felt a trusting relationship with HCPs. Consultations became more person-specific, which helped patients and HCPs to discover inadequate diabetes self-management through shared decision-making. Compared with routine consultations, HCPs talked less and patients talked more. Seven of ten dialogue tools were used by all patients. It was difficult to complete the EMMA consultations within the scheduled time.
The EMMA program was feasible, usable, and acceptable to patients and HCPs. The use of tools elicited patients’ perspectives and facilitated patient participation and shared decision-making.
type 2 diabetes; adherence; participation; dialogue; health education; self-management
Clinical trial and epidemiological data indicate that the cardiovascular effects of estrogen are complex, including a mixture of both potentially beneficial and harmful effects. In animal models, estrogen protects females from vascular injury and inhibits atherosclerosis. These effects are mediated by estrogen receptors (ERs), which when bound to estrogen can bind to DNA to directly regulate transcription. ERs can also activate several cellular kinases by inducing a “rapid” non-nuclear signaling cascade. However, the biologic significance of this rapid signaling pathway has been unclear.
Methods and Results
Here, we develop a novel transgenic mouse in which rapid signaling is blocked by over-expression of a peptide that prevents ERs from interacting with the scaffold protein, striatin (the Disrupting Peptide Mouse, DPM). Microarray analysis of ex vivo-treated mouse aortas demonstrates that rapid ER signaling plays an important role in estrogen-mediated gene regulatory responses. Disruption of ER-striatin interactions also eliminates the ability of estrogen to stimulate cultured endothelial cell migration and to inhibit cultured vascular smooth muscle cell growth. The importance of these findings is underscored by in vivo experiments demonstrating loss of estrogen-mediated protection against vascular injury in the DPM mouse following carotid artery wire injury.
Taken together, these results support the conclusion that rapid, non-nuclear ER signaling contributes to the transcriptional regulatory functions of ER, and is essential for many of the vasoprotective effects of estrogen. These findings also identify the rapid ER signaling pathway as a potential target for the development of novel therapeutic agents.
cardiovascular diseases; hormones; molecular biology; signal transduction
Upon stimulation of mature B cells, class switch recombination (CSR) can alter the specific immunoglobulin heavy chain constant region that is expressed. In a tissue culture cell line, we previously demonstrated that inhibition of late SV40 factor (LSF) family members enhanced IgM to IgA CSR. Here, isotype specificity of CSR regulation by LSF family members is addressed in primary mouse splenic B cells. First, we demonstrate that LBP-1a is the prevalent family member in B lymphocytes. Second, we demonstrate by ChIP that LBP-1a binds genomic sequences around mouse switch regions (S) in an isotype-specific manner, in accordance with computational predictions: binding is observed to Sμ and Sα, but not to the tested Sγ1, regions. Importantly, binding of LBP-1a is tightly regulated, with occupancy at genomic S regions dramatically decreasing following LPS stimulation. Finally, the consequence of DNA-binding by LBP-1a is determined using bone marrow chimeric mice in which LSF/LBP-1 activity is inhibited in hematopoietic lineages. Upon in vitro stimulation of such primary B-cells, CSR occurs with a higher efficiency to IgA, but not to IgG1. These results are supportive of a model whereby LBP-1a represses CSR in an isotype-specific manner via direct interaction with switch regions involved in the recombination.
B cells; Immunoglobulins; Molecular Biology; Recombinant Viral Vectors
LSF is a mammalian transcription factor that is rapidly and quantitatively phosphorylated upon growth induction of resting, peripheral human T cells, as assayed by a reduction in its electrophoretic mobility. The DNA-binding activity of LSF in primary T cells is greatly increased after this phosphorylation event [Volker et al., 1997]. We demonstrate here that LSF is also rapidly and quantitatively phosphorylated upon growth induction in NIH 3T3 cells, although its DNA-binding activity is not significantly altered. Three lines of experimentation established that ERK is responsible for phosphorylating LSF upon growth induction in both cell types. First, phosphorylation of LSF by ERK is sufficient to cause the reduced electrophoretic mobility of LSF. Second, the amount of ERK activity correlates with the extent of LSF phosphorylation in both primary human T cells and NIH 3T3 cells. Finally, specific inhibitors of the Ras/Raf/MEK/ERK pathway inhibit LSF modification in vivo. This phosphorylation by ERK is not sufficient for activation of LSF DNA-binding activity, as evidenced both in vitro and in mouse fibroblasts. Nonetheless, activation of ERK is a prerequisite for the substantial increase in LSF DNA-binding activity upon activation of resting T cells, indicating that ERK phosphorylation is necessary but not sufficient for activation of LSF in this cell type.
ERK; LSF; T cells; fibroblasts; DNA-binding; phosphorylation
Transcriptional regulation in mammalian cells is driven by a complex interplay of multiple transcription factors that respond to signals from either external or internal stimuli. A single transcription factor can control expression of distinct sets of target genes, dependent on its state of post-translational modifications, interacting partner proteins, and the chromatin environment of the cellular genome. Furthermore, many transcription factors can act as either transcriptional repressors or activators, depending on promoter and cellular contexts (Alvarez, et al., 2003). Even in this light, the versatility of LSF (Late SV40 Factor) is remarkable. A hallmark of LSF is its unusual DNA binding domain, as evidenced both by lack of homology to any other established DNA-binding domains and by its DNA recognition sequence. Although a dimer in solution, LSF requires additional multimerization with itself or partner proteins in order to interact with DNA. Transcriptionally, LSF can function as an activator or a repressor. It is a direct target of an increasing number of signal transduction pathways. Biologically, LSF plays roles in cell cycle progression and cell survival, as well as in cell lineage-specific functions, shown most strikingly to date in hematopoietic lineages.
This review discusses how the unique aspects of LSF DNA-binding activity may make it particularly susceptible to regulation by signal transduction pathways and may relate to its distinct biological roles. We present current progress in elucidation of both tissue-specific and more universal cellular roles of LSF. Finally, we discuss suggestive data linking LSF to signaling by the amyloid precursor protein and to Alzheimer's disease, as well as to the regulation of latency of the human immunodeficiency virus (HIV).
GRH; DNA-binding; signal transduction; cell cycle progression; immune response; APP; HIV
The transcription factor LSF (Late SV40 Factor), also known as TFCP2, belongs to the LSF/CP2 family related to the Grainyhead family of proteins and is involved in many biological events, including regulation of cellular and viral promoters, cell cycle, DNA synthesis, cell survival and Alzheimer’s disease. Our recent studies establish an oncogenic role of LSF in Hepatocellular carcinoma (HCC). LSF overexpression is detected in human HCC cell lines and in more than 90% of cases of human HCC, compared to normal hepatocytes and liver, and its expression level showed significant correlation with the stages and grades of the disease. Forced overexpression of LSF in less aggressive HCC cells resulted in highly aggressive, angiogenic and multi-organ metastatic tumors in nude mice. Conversely, inhibition of LSF significantly abrogated growth and metastasis of highly aggressive HCC cells in nude mice. Microarray studies revealed that as a transcription factor LSF modulated specific genes regulating invasion, angiogenesis, chemoresistance and senescence. LSF transcriptionally regulates the thymidylate synthase (TS) gene, thus contributing to cell cycle regulation and chemoresistance. Our studies identify a network of proteins, including osteopontin (OPN), Matrix metalloproteinase-9 (MMP-9), c-Met and complement factor H (CFH), that are directly regulated by LSF and play important roles in LSF-induced hepatocarcinogenesis. A high throughput screening identified small molecule inhibitors of LSF DNA binding and the prototype of these molecules, Factor Quinolinone inhibitor 1 (FQI1), profoundly inhibited cell viability and induced apoptosis in human HCC cells without exerting harmful effects on normal immortal human hepatocytes and primary mouse hepatocytes. In nude mouse xenograft studies, FQI1 markedly inhibited growth of human HCC xenografts as well as angiogenesis without exerting any toxicity.
These studies establish a key role of LSF in hepatocarcinogenesis and usher in a novel therapeutic avenue for HCC, an invariably fatal disease.
Late SV40 Factor (LSF); hepatocellular carcinoma (HCC); osteopontin (OPN); matrix metalloproteinase-9 (MMP-9); c-Met; thymidylate synthase (TS); angiogenesis; metastasis; cell cycle regulation; small molecule inhibitors; FQI1
Many platforms for genome-wide analysis of gene expression contain ‘redundant’ measures for the same gene. For example, the most highly utilized platforms for gene expression microarrays, Affymetrix GeneChip® arrays, have as many as ten or more probe sets for some genes. Occasionally, individual probe sets for the same gene report different trends in expression across experimental conditions, a situation that must be resolved in order to accurately interpret the data. We developed an algorithm, SCOREM, for determining the level of agreement between such probe sets, utilizing a statistical test of concordance, Kendall's W coefficient of concordance, and a graph-searching algorithm for the identification of concordant probe sets. We also present methods for consolidating concordant groups into a single value for its corresponding gene and for post hoc analysis of discordant groups. By combining statistical consolidation with sequence analysis, SCOREM possesses the unique ability to identify biologically meaningful discordant behaviors, including differing behaviors in alternate RNA isoforms and tissue-specific patterns of expression. When consolidating concordant behaviors, SCOREM outperforms other methods in detecting both differential expression and overrepresented functional categories.
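To make the concordance statistic concrete, the following is a minimal sketch of Kendall's W computed over probe-set expression profiles. It assumes each profile is a list of expression values over the same ordered conditions, and it uses the standard formula without a tie correction. The function names are hypothetical, and SCOREM's own tie handling, significance threshold, and graph search are not described in the abstract and are not reproduced here.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kendalls_w(profiles):
    """Kendall's W for m profiles measured over the same n conditions.

    W = 12 * S / (m^2 * (n^3 - n)), where S is the sum of squared deviations
    of the per-condition rank sums from their mean. No tie correction is applied.
    """
    m = len(profiles)
    n = len(profiles[0])
    rank_rows = [average_ranks(p) for p in profiles]
    rank_sums = [sum(row[j] for row in rank_rows) for j in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((r - mean_sum) ** 2 for r in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


if __name__ == "__main__":
    # Three hypothetical probe sets that rise together across four conditions.
    print(kendalls_w([[1, 2, 3, 4], [10, 20, 30, 40], [0.1, 0.2, 0.3, 0.4]]))
```

W ranges from 0 (no agreement in ranking across probe sets) to 1 (identical rankings), which is why it suits the question of whether probe sets for one gene report the same trend.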
HMGN; chromatin; transcription; transcription factors; chromatin remodeling; protein modifications
Approximately 25% of all patients with stage II colorectal cancer will experience recurrent disease and subsequently die within 5 years. MicroRNA-21 (miR-21) is upregulated in several cancer types and has been associated with survival in colon cancer. In the present study we developed a robust in situ hybridization assay using high-affinity Locked Nucleic Acid (LNA) probes that specifically detect miR-21 in formalin-fixed paraffin embedded (FFPE) tissue samples. The expression of miR-21 was analyzed by in situ hybridization on 130 stage II colon and 67 stage II rectal cancer specimens. The miR-21 signal was revealed as a blue chromogenic reaction, predominantly observed in fibroblast-like cells located in the stromal compartment of the tumors. The expression levels were measured using image analysis. The miR-21 signal was determined as the total blue area (TB), or the area fraction relative to the nuclear density (TBR) obtained using a red nuclear stain. High TBR (and TB) estimates of miR-21 expression correlated significantly with shorter disease-free survival (p = 0.004, HR = 1.28, 95% CI: 1.06–1.55) in the stage II colon cancer patient group, whereas no significant correlation with disease-free survival was observed in the stage II rectal cancer group. In multivariate analysis both TB and TBR estimates were independent of other clinical parameters (age, gender, total leukocyte count, K-RAS mutational status and MSI). We conclude that miR-21 is primarily a stromal microRNA, which when measured by image analysis identifies a subgroup of stage II colon cancer patients with short disease-free survival.
Electronic supplementary material
The online version of this article (doi:10.1007/s10585-010-9355-7) contains supplementary material, which is available to authorized users.
MicroRNA; MiR-21; Colorectal cancer; In situ hybridization; LNA
Cell cycle progression in mammalian cells from G1 into S phase requires sensing and integration of multiple inputs, in order to determine whether to continue to cellular DNA replication and subsequently, to cell division. Passage to S requires transition through the restriction point, which at a molecular level consists of a bistable switch involving E2Fs and pRb family members. At the G1/S boundary, a number of genes essential for DNA replication and cell cycle progression are upregulated, promoting entry into S phase. Although the activating E2Fs are the most extensively characterized transcription factors driving G1/S expression, LSF is also a transcription factor essential for stimulating G1/S gene expression. A critical LSF target gene at this stage, Tyms, encodes thymidylate synthetase. In investigating how LSF is activated in a cell cycle-dependent manner, we recently identified a novel time delay mechanism for regulating its activity during G1 progression, which is apparently independent of the E2F/pRb axis. This involves inhibition of LSF in early G1 by two major proliferative signaling pathways: ERK and cyclin C/CDK, followed by gradual dephosphorylation during mid- to late-G1. Whether LSF and E2F act independently or in concert to promote G1/S progression remains to be determined.
LSF; cyclin C/CDK; ERK; thymidylate synthetase; E2F; pRb; p53; G1 phase; S phase; restriction point
The transcription factors of the LSF/Grainyhead (GRH) family are characterized by the possession of a distinctive DNA-binding domain that bears no clear relationship to other known DNA-binding domains, with the possible exception of the p53 core domain. In triploblastic animals, the LSF and GRH subfamilies have diverged extensively with respect to their biological roles, general expression patterns, and mechanism of DNA binding. For example, Grainyhead (GRH) homologs are expressed primarily in the epidermis, and they appear to play an ancient role in maintaining the epidermal barrier. By contrast, LSF homologs are more widely expressed, and they regulate general cellular functions such as cell cycle progression and survival in addition to cell-lineage specific gene expression.
To illuminate the early evolution of this family and reconstruct the functional divergence of LSF and GRH, we compared homologs from 18 phylogenetically diverse taxa, including four basal animals (Nematostella vectensis, Vallicula multiformis, Trichoplax adhaerens, and Amphimedon queenslandica), a choanoflagellate (Monosiga brevicollis) and several fungi. Phylogenetic and bioinformatic analyses of these sequences indicate that (1) the LSF/GRH gene family originated prior to the animal-fungal divergence, and (2) the functional diversification of the LSF and GRH subfamilies occurred prior to the divergence between sponges and eumetazoans. Aspects of the domain architecture of LSF/GRH proteins are well conserved between fungi, choanoflagellates, and metazoans, though within the Metazoa, the LSF and GRH families are clearly distinct. We failed to identify a convincing LSF/GRH homolog in the sequenced genomes of the algae Volvox carteri and Chlamydomonas reinhardtii or the amoebozoan Dictyostelium purpureum. Interestingly, the ancestral GRH locus has become split into two separate loci in the sea anemone Nematostella, with one locus encoding a DNA binding domain and the other locus encoding the dimerization domain.
In metazoans, LSF and GRH proteins play a number of roles that are essential to achieving and maintaining multicellularity. It is now clear that this protein family already existed in the unicellular ancestor of animals, choanoflagellates, and fungi. However, the diversification of distinct LSF and GRH subfamilies appears to be a metazoan invention. Given the conserved role of GRH in maintaining epithelial integrity in vertebrates, insects, and nematodes, it is noteworthy that the evolutionary origin of Grh appears roughly coincident with the evolutionary origin of the epithelium.
Transcription factor LSF is required for progression from quiescence through the cell cycle, regulating thymidylate synthase (Tyms) expression at the G1/S boundary. Given the constant level of LSF protein from G0 through S, we investigated whether LSF is regulated by phosphorylation in G1. In vitro, LSF is phosphorylated by cyclin E/cyclin-dependent kinase 2 (CDK2), cyclin C/CDK2, and cyclin C/CDK3, predominantly on S309. Phosphorylation of LSF on S309 is maximal 1 to 2 h after mitogenic stimulation of quiescent mouse fibroblasts. This phosphorylation is mediated by cyclin C-dependent kinases, as shown by coimmunoprecipitation of LSF and cyclin C in early G1 and by abrogation of LSF S309 phosphorylation upon suppression of cyclin C with short interfering RNA. Although mouse fibroblasts lack functional CDK3 (the partner of cyclin C in early G1 in human cells), CDK2 compensates for this absence. By transient transfection assays, phosphorylation at S309, mediated by cyclin C overexpression, inhibits LSF transactivation. Moreover, overexpression of cyclin C and CDK3 inhibits induction of endogenous Tyms expression at the G1/S transition. These results identify LSF as only the second known target (in addition to pRb) of cyclin C/CDK activity during progression from quiescence to early G1. Unexpectedly, this phosphorylation prevents induction of LSF target genes until late G1.
HMGN1, an abundant nucleosomal binding protein, can affect both the chromatin higher order structure and the modification of nucleosomal histones, but it alters the expression of only a subset of genes. We investigated specific gene targeting by HMGN1 in the context of estrogen induction of gene expression. Knockdown and overexpression experiments indicated that HMGN1 limits the induction of several estrogen-regulated genes, including TFF1 and FOS, which are induced by estrogen through entirely distinct mechanisms. HMGN1 specifically interacts with estrogen receptor α (ERα), both in vitro and in vivo. At the TFF1 promoter, estrogen increases HMGN1 association through recruitment by the ERα. HMGN1 S20E/S24E, although deficient in binding nucleosomal DNA, still interacts with ERα and, strikingly, still represses estrogen-driven activation of the TFF1 gene. On the FOS promoter, which lacks the ERα binding sites, constitutively bound serum response factor (SRF) mediates estrogen stimulation. HMGN1 also interacts specifically with SRF, but HMGN1 S20E/S24E does not. Consistent with the protein interactions, only wild-type HMGN1 significantly inhibits the estrogen-driven activation of the FOS gene. Mechanistically, the inhibition of estrogen induction of several ERα-associated genes, including TFF1, by HMGN1 correlates with decreased levels of acetylation of Lys9 on histone H3. Together, these findings indicate that HMGN1 regulates the expression of particular genes via specific protein-protein interactions with transcription factors at target gene regulatory regions.
Human immunodeficiency virus type 1 (HIV-1) establishes a persistent, nonproductive state within a small population of memory CD4+ cells. The transcription factor LSF binds to sequences within the HIV-1 long terminal repeat (LTR) initiation region and recruits a second factor, YY1, to the LTR. These factors then cooperatively recruit histone deacetylase 1 to the LTR, resulting in inhibition of transcription. This appears to be one mechanism contributing to HIV persistence within resting CD4+ T cells. We sought to further detail LSF binding to the HIV-1 LTR and factors that regulate LSF occupancy. We find that LSF binds the LTR as a tetramer and that binding is regulated by phosphorylation mediated by mitogen-activated protein kinases (MAPKs). In vitro, phosphorylation of LSF by Erk decreases binding to the LTR, while binding is increased by p38 phosphorylation. LSF occupancy at LTR chromatin is increased by the p38 agonist anisomycin and decreased by specific p38 inhibition. p38 inhibition also results in increased acetylation of histone H4 at the LTR nucleosome adjacent to the LSF binding site. p38 inhibition also blocked the ability of YY1 to inhibit activation of the integrated HIV promoter. Finally, HIV was recovered from the resting CD4+ T cells of aviremic, HIV-infected donors upon treatment of these cells with a specific inhibitor of p38. These data suggest that the MAPK pathway regulates LSF binding to the LTR and thereby one aspect of the regulation of HIV expression. This mechanism could be exploited as a novel therapeutic target to disrupt latent HIV infection.
The interaction of proteins with DNA recognition motifs regulates a number of fundamental biological processes, including transcription. To understand these processes, we need to know which motifs are present in a sequence and which factors bind to them. We describe a method to screen a set of DNA sequences against a precompiled library of motifs, and assess which, if any, of the motifs are statistically over- or under-represented in the sequences. Over-represented motifs are good candidates for playing a functional role in the sequences, while under-representation hints that if the motif were present, it would have a harmful dysregulatory effect. We apply our method (implemented as a computer program called Clover) to dopamine-responsive promoters, sequences flanking binding sites for the transcription factor LSF, sequences that direct transcription in muscle and liver, and Drosophila segmentation enhancers. In each case Clover successfully detects motifs known to function in the sequences, and intriguing and testable hypotheses are made concerning additional motifs. Clover compares favorably with an ab initio motif discovery algorithm based on sequence alignment, when the motif library includes only a homolog of the factor that actually regulates the sequences. It also demonstrates superior performance over two contingency table based over-representation methods. In conclusion, Clover has the potential to greatly accelerate characterization of signals that regulate transcription.
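The idea of screening sequences against a library motif and asking whether the motif is over-represented can be illustrated with a simplified sketch. This is not Clover's algorithm: the scoring (average likelihood ratio over all windows), the uniform background, the shuffling null model, and every name below are assumptions chosen to keep the example short.

```python
import random

# Assumed uniform background base frequencies.
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}


def window_ratio(window, motif):
    """Likelihood ratio P(window | motif) / P(window | background)."""
    ratio = 1.0
    for base, column in zip(window, motif):
        ratio *= column[base] / BACKGROUND[base]
    return ratio


def sequence_score(seq, motif):
    """Average likelihood ratio over every window of the motif's width."""
    width = len(motif)
    n_windows = len(seq) - width + 1
    if n_windows <= 0:
        return 0.0
    return sum(window_ratio(seq[i:i + width], motif) for i in range(n_windows)) / n_windows


def set_score(seqs, motif):
    """Mean sequence score over a set of sequences."""
    return sum(sequence_score(s, motif) for s in seqs) / len(seqs)


def shuffle_pvalue(seqs, motif, n_shuffles=200, seed=0):
    """Fraction of base-shuffled sequence sets scoring at least as high as observed."""
    rng = random.Random(seed)
    observed = set_score(seqs, motif)
    hits = 0
    for _ in range(n_shuffles):
        shuffled = []
        for s in seqs:
            chars = list(s)
            rng.shuffle(chars)  # preserves base composition, destroys order
            shuffled.append("".join(chars))
        if set_score(shuffled, motif) >= observed:
            hits += 1
    return (hits + 1) / (n_shuffles + 1)  # add-one so the estimate is never zero


if __name__ == "__main__":
    # Hypothetical 4-column motif favouring TATA; probabilities are invented.
    strong = lambda b: {x: (0.97 if x == b else 0.01) for x in "ACGT"}
    tata = [strong("T"), strong("A"), strong("T"), strong("A")]
    print(shuffle_pvalue(["GCGCTATAGCGC", "CCGTATAGGCGC"], tata))
```

Under-representation could be assessed the same way by counting shuffled sets that score at or below the observed value.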
We have developed a computational method for transcriptional regulatory network inference, CARRIE (Computational Ascertainment of Regulatory Relationships Inferred from Expression), which combines microarray and promoter sequence analysis. CARRIE uses these sources of data to identify the transcription factors (TFs) that regulate gene expression changes in response to a stimulus and generates testable hypotheses about the regulatory network connecting these TFs to the genes they regulate. The promoter analysis component of CARRIE, ROVER (Relative OVER-abundance of cis-elements), is highly accurate at detecting the TFs that regulate the response to a stimulus. ROVER also predicts which genes are regulated by each of these TFs. CARRIE uses these transcriptional interactions to infer a regulatory network. To demonstrate our method, we applied CARRIE to six sets of publicly available DNA microarray experiments on Saccharomyces cerevisiae. The predicted networks were validated with comparisons to literature sources, experimental TF binding data, and gene ontology biological process information.
Algorithms that detect and align locally similar regions of biological sequences have the potential to discover a wide variety of functional motifs. Two theoretical contributions to this classic but unsolved problem are presented here: a method to determine the width of the aligned motif automatically; and a technique for calculating the statistical significance of alignments, i.e. an assessment of whether the alignments are stronger than those that would be expected to occur by chance among random, unrelated sequences. Upon exploring variants of the standard Gibbs sampling technique to optimize the alignment, we discovered that simulated annealing approaches perform more efficiently. Finally, we conduct failure tests by applying the algorithm to increasingly difficult test cases, and analyze the manner of and reasons for eventual failure. Detection of transcription factor-binding motifs is limited by the motifs’ intrinsic subtlety rather than by inadequacy of the alignment optimization procedure.
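The simulated annealing optimization mentioned in this abstract can be sketched as follows. This is an illustrative toy rather than the published algorithm: it assumes a fixed motif width (the abstract's automatic width selection and significance calculation are omitted), exactly one site per sequence, a uniform background, and a geometric cooling schedule. All names and parameter defaults are hypothetical.

```python
import math
import random

BASES = "ACGT"

def alignment_score(sequences, starts, width, pseudocount=0.5):
    """Log-likelihood ratio of the aligned sites versus a uniform background."""
    n = len(sequences)
    score = 0.0
    for offset in range(width):
        counts = dict.fromkeys(BASES, 0)
        for seq, start in zip(sequences, starts):
            counts[seq[start + offset]] += 1
        for base in BASES:
            freq = (counts[base] + pseudocount) / (n + 4 * pseudocount)
            score += counts[base] * math.log(freq / 0.25)
    return score

def anneal_alignment(sequences, width, steps=5000, t_start=2.0, t_end=0.05, seed=0):
    """Search for one site per sequence that maximizes alignment_score."""
    rng = random.Random(seed)
    starts = [rng.randrange(len(s) - width + 1) for s in sequences]
    current = alignment_score(sequences, starts, width)
    best_starts, best = list(starts), current
    for step in range(steps):
        # Geometric cooling from t_start down to t_end (an assumed schedule).
        temperature = t_start * (t_end / t_start) ** (step / max(1, steps - 1))
        i = rng.randrange(len(sequences))
        old = starts[i]
        starts[i] = rng.randrange(len(sequences[i]) - width + 1)
        proposed = alignment_score(sequences, starts, width)
        delta = proposed - current
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current = proposed
            if current > best:
                best_starts, best = list(starts), current
        else:
            starts[i] = old  # reject the move and restore the previous site
    return best_starts, best
```

Calling `anneal_alignment(seqs, width=6)` returns the best site positions visited and their score. As with any stochastic search, different seeds can end in different local optima, so several restarts are advisable.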
The LSF/Grainyhead transcription factor family is involved in many important biological processes, including cell cycle, cell growth and development. In order to investigate the evolutionary conservation of these biological roles, we have characterized two new family members in Caenorhabditis elegans and Xenopus laevis. The C.elegans member, Ce-GRH-1, groups with the Grainyhead subfamily, while the X.laevis member, Xl-LSF, groups with the LSF subfamily. Ce-GRH-1 binds DNA in a sequence-specific manner identical to that of Drosophila melanogaster Grainyhead. In addition, Ce-GRH-1 binds to sequences upstream of the C.elegans gene encoding aromatic L-amino-acid decarboxylase and genes involved in post-embryonic development, mab-5 and dbl-1. All three C.elegans genes are homologs of D.melanogaster Grainyhead-regulated genes. RNA-mediated interference of Ce-grh-1 results in embryonic lethality in worms, accompanied by soft, defective cuticles. These phenotypes are strikingly similar to those observed previously in D.melanogaster grainyhead mutants, suggesting conservation of the developmental role of these family members over the course of evolution. Our phylogenetic analysis of the expanded LSF/GRH family (including other previously unrecognized proteins/ESTs) suggests that the structural and functional dichotomy of this family dates back more than 700 million years, i.e. to the time when the first multicellular organisms are thought to have arisen.
The human genome encodes the transcriptional control of its genes in clusters of cis-elements that constitute enhancers, silencers and promoter signals. The sequence motifs of individual cis-elements are usually too short and degenerate for confident detection. In most cases, the requirements for organization of cis-elements within these clusters are poorly understood. Therefore, we have developed a general method to detect local concentrations of cis-element motifs, using predetermined matrix representations of the cis-elements, and calculate the statistical significance of these motif clusters. The statistical significance calculation is highly accurate not only for idealized, pseudorandom DNA, but also for real human DNA. We use our method ‘cluster of motifs E-value tool’ (COMET) to make novel predictions concerning the regulation of genes by transcription factors associated with muscle. COMET performs comparably with two alternative state-of-the-art techniques, which are more complex and lack E-value calculations. Our statistical method enables us to clarify the major bottleneck in the hard problem of detecting cis-regulatory regions, which is that many known enhancers do not contain very significant clusters of the motif types that we search for. Thus, discovery of additional signals that belong to these regulatory regions will be the key to future progress.
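The notion of a local concentration of matrix-defined motifs can be illustrated with a simple fixed-window scan. This sketch is a deliberately simplified stand-in, not COMET's method: it assumes a hard score threshold, a fixed window length, and single-strand scanning, and it sums hit scores instead of computing an E-value. The function names and parameters are hypothetical.

```python
import math

BASES = "ACGT"

def log_odds_matrix(counts, pseudocount=0.5, background=0.25):
    """Turn per-position base counts into log-odds scores (uniform background assumed)."""
    matrix = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        matrix.append({b: math.log(((column[b] + pseudocount) / total) / background)
                       for b in BASES})
    return matrix

def motif_hits(sequence, matrix, threshold):
    """List (start, score) for every position scoring at or above the threshold."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        score = sum(matrix[i][sequence[start + i]] for i in range(width))
        if score >= threshold:
            hits.append((start, score))
    return hits

def densest_window(sequence, matrices, threshold, window):
    """Return (window_start, total_score, hits) for the window with the largest summed hit score."""
    hits = sorted((start, score, name)
                  for name, matrix in matrices.items()
                  for start, score in motif_hits(sequence, matrix, threshold))
    best = (0, 0.0, [])
    for left in range(max(1, len(sequence) - window + 1)):
        # Keep only hits that lie entirely inside the current window.
        inside = [h for h in hits
                  if left <= h[0] and h[0] + len(matrices[h[2]]) <= left + window]
        total = sum(h[1] for h in inside)
        if total > best[1]:
            best = (left, total, inside)
    return best
```

Here `matrices` maps a motif name to its log-odds matrix, and each returned hit is a `(start, score, name)` tuple. Judging whether the best window is stronger than expected by chance needs a separate significance model, which is the part this sketch does not attempt.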
The TATA sequence of the human, estrogen-responsive pS2 promoter is complexed in vivo with a rotationally and translationally positioned nucleosome (NUC T). Using a chromatin immunoprecipitation assay, we demonstrate that TATA binding protein (TBP) does not detectably interact with this genomic binding site in MCF-7 cells in the absence of transcriptional stimuli. Estrogen stimulation of these cells results in hyperacetylation of both histones H3 and H4 within the pS2 chromatin encompassing NUC T and the TATA sequence. Concurrently, TBP becomes associated with the pS2 promoter region. The relationship between histone hyperacetylation and the binding of TBP was assayed in vitro using an in vivo-assembled nucleosomal array over the pS2 promoter. With chromatin in its basal state, the binding of TBP to the pS2 TATA sequence at the edge of NUC T was severely restricted, consistent with our in vivo data. Acetylation of the core histones facilitated the binding of TBP to this nucleosomal TATA sequence. Therefore, we demonstrate that one specific, functional consequence of induced histone acetylation at a native promoter is the alleviation of nucleosome-mediated repression of the binding of TBP. Our data support a fundamental role for histone acetylation at genomic promoters in transcriptional activation by nuclear receptors and provide a general mechanism for rapid and reversible transcriptional activation from a chromatin template.