Determining how facultative anaerobic organisms sense and direct cellular responses to electron acceptor availability has been a subject of intense study. However, even in the model organism Escherichia coli, established mechanisms only explain a small fraction of the hundreds of genes that are regulated during electron acceptor shifts. Here we propose a qualitative model that accounts for the full breadth of regulated genes by detailing how two global transcription factors (TFs), ArcA and Fnr of E. coli, sense key metabolic redox ratios and act on a genome-wide basis to regulate anabolic, catabolic, and energy generation pathways. We first fill gaps in our knowledge of this transcriptional regulatory network by carrying out ChIP-chip and gene expression experiments to identify 463 regulatory events. We then interfaced this reconstructed regulatory network with a highly curated genome-scale metabolic model to show that ArcA and Fnr regulate >80% of total metabolic flux and 96% of differential gene expression across fermentative and nitrate respiratory conditions. Based on the data, we propose a feedforward with feedback trim regulatory scheme, given the extensive repression of catabolic genes by ArcA and extensive activation of chemiosmotic genes by Fnr. We further corroborated this regulatory scheme by showing a correlation (r² = 0.71, p<1e-6) between changes in metabolic flux and changes in regulatory activity across fermentative and nitrate respiratory conditions. Finally, we are able to relate the proposed model to a wealth of previously generated data by contextualizing the existing transcriptional regulatory network.
All heterotrophic organisms must balance the deployment of consumed carbon compounds between growth and the generation of energy. These two competing objectives have been shown, both computationally and experimentally, to exist as the principal dimensions of the function of metabolic networks. Each of these dimensions can also be thought of as the familiar metabolic functions of catabolism, anabolism, and generation of energy. Here we detail how two global transcription factors (TFs), ArcA and Fnr of Escherichia coli, which sense redox ratios, act on a genome-wide basis to coordinately regulate these global metabolic functions through transcriptional control of enzyme and transporter levels in changing environments. The study yields a model that shows how global transcription factors regulate global dimensions of metabolism and form a regulatory hierarchy that reflects the structural hierarchy of the metabolic network.
Gene targeting in human somatic cells is of importance because it can be used to either delineate the loss-of-function phenotype of a gene or correct a mutated gene back to wild-type. Both of these outcomes require a form of DNA double-strand break (DSB) repair known as homologous recombination (HR). The mechanism of HR leading to gene targeting, however, is not well understood in human cells. Here, we demonstrate that a two-end, ends-out HR intermediate is valid for human gene targeting. Furthermore, the resolution step of this intermediate occurs via the classic DSB repair model of HR while synthesis-dependent strand annealing and Holliday Junction dissolution are, at best, minor pathways. Moreover, and in contrast to other systems, the positions of Holliday Junction resolution are evenly distributed along the homology arms of the targeting vector. Most unexpectedly, we demonstrate that when a meganuclease is used to introduce a chromosomal DSB to augment gene targeting, the mechanism of gene targeting is inverted to an ends-in process. Finally, we demonstrate that the anti-recombination activity of mismatch repair is a significant impediment to gene targeting. These observations significantly advance our understanding of HR and gene targeting in human cells.
Gene targeting is important for basic research and clinical applications. In the laboratory, gene targeting is used to knock out genes so that loss-of-function phenotypes can be assessed. In the clinic, gene targeting is the gold standard to which most gene therapy approaches aspire. One of the most promising tools for gene targeting in humans is recombinant adeno-associated virus (rAAV). The mechanism by which rAAV performs gene targeting has, however, remained obscure. Surprisingly, we demonstrate here that the normally single-stranded rAAV performs gene targeting via double-stranded intermediates, which are mechanistically indistinguishable from standard plasmid-mediated gene targeting. Moreover, we establish the double-strand break (DSB) repair model as the paradigm to describe human gene targeting, and delineate the dynamics of crossovers in this model. Most unexpectedly, we demonstrate that when a meganuclease is used to introduce a chromosomal DSB to augment gene targeting, the mechanism of gene targeting is inverted such that the chromosome becomes the “attacker” instead of the “attackee”. Finally, we confirm that the anti-recombination activity of mismatch repair is a significant impediment to gene targeting. These observations advance our understanding of the mechanism of human gene targeting and should readily lend themselves to developing improvements to existing methodologies.
Mechanisms generating diverse cell types from multipotent progenitors are crucial for normal development. Neural crest cells (NCCs) are multipotent stem cells that give rise to numerous cell-types, including pigment cells. Medaka has four types of NCC-derived pigment cells (xanthophores, leucophores, melanophores and iridophores), making medaka pigment cell development an excellent model for studying the mechanisms controlling specification of distinct cell types from a multipotent progenitor. Medaka many leucophores-3 (ml-3) mutant embryos exhibit a unique phenotype characterized by excessive formation of leucophores and absence of xanthophores. We show that ml-3 encodes sox5, which is expressed in premigratory NCCs and differentiating xanthophores. Cell transplantation studies reveal a cell-autonomous role of sox5 in the xanthophore lineage. pax7a is expressed in NCCs and required for both xanthophore and leucophore lineages; we demonstrate that Sox5 functions downstream of Pax7a. We propose a model in which multipotent NCCs first give rise to pax7a-positive partially fate-restricted intermediate progenitors for xanthophores and leucophores; some of these progenitors then express sox5, and as a result of Sox5 action develop into xanthophores. Our results provide the first demonstration that Sox5 can function as a molecular switch driving specification of a specific cell-fate (xanthophore) from a partially-restricted, but still multipotent, progenitor (the shared xanthophore-leucophore progenitor).
How individual cell fates are specified from multipotent progenitor cells is a fundamental question in developmental and stem cell biology. Accumulating evidence indicates that stem cells develop into each of their final, diverse cell-types after progression through one or more partially-restricted intermediates, but the molecular mechanisms underlying final fate choice are largely unknown. Neural crest cells (NCCs) give rise to diverse cell-types including multiple pigment cells and thus are a favored model for understanding the mechanism of fate specification. We have investigated how a specific fate choice is made from partially-restricted pigment cell progenitors in medaka. We show that the Sry-related transcription factor Sox5 is required for fate determination between yellow xanthophore and white leucophore, and its loss causes excessive formation of leucophores and absence of xanthophores. We demonstrate that Sox5 functions cell-autonomously in the xanthophore lineage in medaka. Furthermore, pax7a is expressed in the partially-restricted progenitor cells shared by the xanthophore and leucophore lineages, and Sox5 acts in some of these cells to promote the xanthophore lineage. Our work reveals the role of Sox5 as a molecular switch determining xanthophore versus leucophore fate choice from the shared progenitor, and identifies an important mechanism regulating pigment cell fate choice from NCCs.
Single Nucleotide Polymorphisms (SNPs) in genes involved in the DNA Base Excision Repair (BER) pathway could be associated with cancer risk in carriers of mutations in the high-penetrance susceptibility genes BRCA1 and BRCA2, given the relation of synthetic lethality that exists between one of the components of the BER pathway, PARP1 (poly ADP ribose polymerase), and both BRCA1 and BRCA2. In the present study, we have performed a comprehensive analysis of 18 genes involved in BER using a tagging SNP approach in a large series of BRCA1 and BRCA2 mutation carriers. 144 SNPs were analyzed in a two-stage study involving 23,463 carriers from the CIMBA consortium (the Consortium of Investigators of Modifiers of BRCA1 and BRCA2). Eleven SNPs showed evidence of association with breast and/or ovarian cancer at p<0.05 in the combined analysis. Four of the five genes for which strongest evidence of association was observed were DNA glycosylases. The strongest evidence was for rs1466785 in the NEIL2 (endonuclease VIII-like 2) gene (HR: 1.09, 95% CI: 1.03–1.16, p = 2.7×10⁻³) for association with breast cancer risk in BRCA2 mutation carriers, and rs2304277 in the OGG1 (8-oxoguanine DNA glycosylase) gene, with ovarian cancer risk in BRCA1 mutation carriers (HR: 1.12, 95% CI: 1.03–1.21, p = 4.8×10⁻³). DNA glycosylases involved in the first steps of the BER pathway may be associated with cancer risk in BRCA1/2 mutation carriers and should be more comprehensively studied.
Women harboring a germ-line mutation in the BRCA1 or BRCA2 genes have a high lifetime risk of developing breast and/or ovarian cancer. However, not all carriers develop cancer, and high variability exists regarding age of onset of the disease and type of tumor. One of the causes of this variability lies in other genetic factors that modulate the phenotype, the so-called modifier genes. Identification of these genes might have important implications for risk assessment and decision making regarding prevention of the disease. Given that BRCA1 and BRCA2 participate in the repair of DNA double strand breaks, here we have investigated whether variations, Single Nucleotide Polymorphisms (SNPs), in genes participating in other DNA repair pathways may be associated with cancer risk in BRCA carriers. We have selected the Base Excision Repair pathway because BRCA defective cells are extremely sensitive to the inhibition of one of its components, PARP1. Thanks to a large international collaborative effort, we have been able to identify at least two SNPs that are associated with increased cancer risk in BRCA1 and BRCA2 mutation carriers respectively. These findings could have implications not only for risk assessment, but also for treatment of BRCA1/2 mutation carriers with PARP inhibitors.
Cleft palate (CP) is one of the most commonly occurring craniofacial birth defects in humans. In order to study cleft palate in a naturally occurring model system, we utilized the Nova Scotia Duck Tolling Retriever (NSDTR) dog breed. Micro-computed tomography analysis of CP NSDTR craniofacial structures revealed that these dogs exhibit defects similar to those observed in a recognizable subgroup of humans with CP: Pierre Robin Sequence (PRS). We refer to this phenotype in NSDTRs as CP1. Individuals with PRS have a triad of birth defects: shortened mandible, posteriorly placed tongue, and cleft palate. A genome-wide association study in 14 CP NSDTRs and 72 unaffected NSDTRs identified a significantly associated region on canine chromosome 14 (24.2 Mb–29.3 Mb; p_raw = 4.64×10⁻¹⁵). Sequencing of two regional candidate homeobox genes in NSDTRs, distal-less homeobox 5 (DLX5) and distal-less homeobox 6 (DLX6), identified a 2.1 kb LINE-1 insertion within DLX6 in CP1 NSDTRs. The LINE-1 insertion is predicted to insert a premature stop codon within the homeodomain of DLX6. This prompted the sequencing of DLX5 and DLX6 in a human cohort with CP, where a missense mutation within the highly conserved DLX5 homeobox of a patient with PRS was identified. This suggests the involvement of DLX5 in the development of PRS. These results demonstrate the power of the canine animal model as a genetically tractable approach to understanding naturally occurring craniofacial birth defects in humans.
Cleft palate is one of the most commonly occurring birth defects in children, and yet its cause is not completely understood. In order to better understand cleft palate we have turned to man's best friend, the domestic dog. Common breeding practices have made the dog a unique animal model to help understand the genetic basis of naturally occurring birth defects. A genome-wide association study of Nova Scotia Duck Tolling Retrievers with naturally occurring cleft palate led to the investigation of two homeobox genes, DLX5 and DLX6. Dogs carrying a mutation within DLX6 also have a shortened lower jaw, resembling people who have Pierre Robin Sequence (PRS). Investigation of people with PRS identified a mutation within a highly conserved and functional region of DLX5 that may contribute to the development of PRS. This exemplifies how the dog will help us better understand common birth defects.
In a GWAS meta-analysis, variants in the growth factor receptor-bound protein 10 (GRB10) gene were associated with reduced glucose-stimulated insulin secretion and increased risk of type 2 diabetes (T2D) if inherited from the father, but inexplicably with reduced fasting glucose when inherited from the mother. GRB10 is a negative regulator of insulin signaling and is imprinted in a parent-of-origin fashion in different tissues. GRB10 knock-down in human pancreatic islets showed reduced insulin and glucagon secretion, which together with changes in insulin sensitivity may explain the paradoxical reduction of glucose despite a decrease in insulin secretion. Together, these findings suggest that tissue-specific methylation and possibly imprinting of GRB10 can influence glucose metabolism and contribute to T2D pathogenesis. The data also emphasize the need in genetic studies to consider whether risk alleles are inherited from the mother or the father.
In this paper, we report the first large genome-wide association study in man for glucose-stimulated insulin secretion (GSIS) indices during an oral glucose tolerance test. We identify seven genetic loci and provide effects on GSIS for all previously reported glycemic traits and obesity genetic loci in a large-scale sample. We observe paradoxical effects of genetic variants in the growth factor receptor-bound protein 10 (GRB10) gene yielding both reduced GSIS and reduced fasting plasma glucose concentrations, specifically showing a parent-of-origin effect of GRB10 on lower fasting plasma glucose and enhanced insulin sensitivity for maternal and elevated glucose and decreased insulin sensitivity for paternal transmissions of the risk allele. We also observe tissue-specific differences in DNA methylation and allelic imbalance in expression of GRB10 in human pancreatic islets. We further disrupt GRB10 by shRNA in human islets, showing reduction of both insulin and glucagon expression and secretion. In conclusion, we provide evidence for complex regulation of GRB10 in human islets. Our data suggest that tissue-specific methylation and imprinting of GRB10 can influence glucose metabolism and contribute to T2D pathogenesis. The data also emphasize the need in genetic studies to consider whether risk alleles are inherited from the mother or the father.
Annotating and interpreting the results of genome-wide association studies (GWAS) remains challenging. Assigning function to genetic variants as expression quantitative trait loci is an expanding and useful approach, but focuses exclusively on mRNA rather than protein levels. Many variants remain without annotation. To address this problem, we measured the steady state abundance of 441 human signaling and transcription factor proteins from 68 Yoruba HapMap lymphoblastoid cell lines to identify novel relationships between inter-individual protein levels, genetic variants, and sensitivity to chemotherapeutic agents. Proteins were measured using micro-western and reverse phase protein arrays from three independent cell line thaws to permit mixed effect modeling of protein biological replicates. We observed enrichment of protein quantitative trait loci (pQTLs) for cellular sensitivity to two commonly used chemotherapeutics: cisplatin and paclitaxel. We functionally validated the target protein of a genome-wide significant trans-pQTL for its relevance in paclitaxel-induced apoptosis. GWAS overlap results of drug-induced apoptosis and cytotoxicity for paclitaxel and cisplatin revealed unique SNPs associated with the pharmacologic traits (at p<0.001). Interestingly, GWAS SNPs from various regions of the genome implicated the same target protein (p<0.0001) that correlated with drug induced cytotoxicity or apoptosis (p≤0.05). Two genes were functionally validated for association with drug response using siRNA: SMC1A with cisplatin response and ZNF569 with paclitaxel response. This work allows pharmacogenomic discovery to progress from the transcriptome to the proteome and offers potential for identification of new therapeutic targets. This approach, linking targeted proteomic data to variation in pharmacologic response, can be generalized to other studies evaluating genotype-phenotype relationships and provide insight into chemotherapeutic mechanisms.
The central dogma of biology explains that DNA is transcribed to mRNA that is further translated into protein. Many genome-wide studies have implicated genetic variation that influences gene expression and that ultimately affects downstream complex traits, including response to drugs. However, because of technical limitations, few studies have evaluated the contribution of genetic variation to protein expression and ensuing effects on downstream phenotypes. To overcome this challenge, we used a novel technology to simultaneously measure the baseline expression of 441 proteins in lymphoblastoid cell lines and compared them with publicly available genetic data. To further illustrate the utility of this approach, we compared protein-level measurements with chemotherapeutic-induced apoptosis and cell-growth inhibition data. This study demonstrates the importance of using protein information to understand the functional consequences of genetic variants identified in genome-wide association studies. This protein data set will also have broad utility for understanding the relationship between other genome-wide studies of complex traits.
Most organisms use 24-hr circadian clocks to keep temporal order and anticipate daily environmental changes. In Drosophila melanogaster, CLOCK (CLK) and CYCLE (CYC) initiate the circadian system by promoting rhythmic transcription of hundreds of genes. However, it is still not clear whether high amplitude transcriptional oscillations are essential for circadian timekeeping. In order to address this issue, we generated flies in which the amplitude of CLK-driven transcription can be reduced partially (approx. 60%) or strongly (90%) without affecting the average levels of CLK-target genes. The impaired transcriptional oscillations lead to low amplitude protein oscillations that were not sufficient to drive outputs of peripheral oscillators. However, circadian rhythms in locomotor activity were resistant to partial reduction in transcriptional and protein oscillations. We found that the resilience of the brain oscillator depends on the neuronal communication among circadian neurons in the brain. Indeed, the capacity of the brain oscillator to overcome low amplitude transcriptional oscillations depends on the action of the neuropeptide PDF and on the pdf-expressing cells having equal or higher amplitude of molecular rhythms than the rest of the circadian neuronal groups in the fly brain. Therefore, our work reveals the importance of high amplitude transcriptional oscillations for cell-autonomous circadian timekeeping. Moreover, we demonstrate that the circadian neuronal network is an essential buffering system that protects against changes in circadian transcription in the brain.
Circadian clocks allow organisms to predict daily environmental changes. These clocks time the sleep/wake cycles and many other physiological and cellular pathways to 24-hr rhythms. The current model states that circadian clocks keep time by the use of biochemical feedback loops. These feedback loops are responsible for the generation of high amplitude oscillations in gene expression. Abolishment of circadian transcriptional oscillations has been shown to abolish circadian function. Previous studies addressing this issue utilized manipulations in which the abolishment of the transcriptional oscillations is very dramatic and involves strong up- or down-regulation of circadian genes. In this study we generated fruit flies in which we diminished the amplitude of circadian oscillations in a controlled way. We found that a decrease of more than 50% in the amplitude of circadian oscillations leads to impaired function of circadian physiological outputs in the periphery but does not significantly affect circadian behavior. This suggests that the clock in the brain has a specific compensatory mechanism. Moreover, we found that flies with reduced oscillation and impaired circadian neuronal communication display aberrant circadian rhythms. These findings support the idea of network buffering mechanisms that allow the brain to produce circadian rhythms even with low amplitude molecular oscillations.
In animals, circadian rhythms in physiology and behavior result from coherent rhythmic interactions between clocks in the brain and those throughout the body. Despite the many tissue specific clocks, most understanding of the molecular core clock mechanism comes from studies of the suprachiasmatic nuclei (SCN) of the hypothalamus and a few other cell types. Here we report establishment and genetic characterization of three cell-autonomous mouse clock models: 3T3 fibroblasts, 3T3-L1 adipocytes, and MMH-D3 hepatocytes. Each model is genetically tractable and has an integrated luciferase reporter that allows for longitudinal luminescence recording of rhythmic clock gene expression using an inexpensive off-the-shelf microplate reader. To test these cellular models, we generated a library of short hairpin RNAs (shRNAs) against a panel of known clock genes and evaluated their impact on circadian rhythms. Knockdown of Bmal1, Clock, Cry1, and Cry2 each resulted in similar phenotypes in all three models, consistent with previous studies. However, we observed cell type-specific knockdown phenotypes for the Period and Rev-Erb families of clock genes. In particular, Per1 and Per2, which have strong behavioral effects in knockout mice, appear to play different roles in regulating period length and amplitude in these peripheral systems. Per3, which has relatively modest behavioral effects in knockout mice, substantially affects period length in the three cellular models and in dissociated SCN neurons. In summary, this study establishes new cell-autonomous clock models that are of particular relevance to metabolism and suitable for screening for clock modifiers, and reveals previously under-appreciated cell type-specific functions of clock genes.
Various aspects of our daily rhythms in physiology and behavior such as the sleep-wake cycle are regulated by endogenous circadian clocks that are present in nearly every cell. It is generally accepted that these oscillators share a similar biochemical negative feedback mechanism, consisting of transcriptional activators and repressors. In this study, we developed cell-autonomous, metabolically relevant clock models in mouse hepatocytes and adipocytes. Each clock model has an integrated luciferase reporter that allows for kinetic luminescence recording with an inexpensive microplate reader and thus is feasible for most laboratories. These models are amenable to high throughput screening of small molecules or genomic entities for impacts on cell-autonomous clocks relevant to metabolism. We validated these new models by RNA interference via lentivirus-mediated knockdown of known clock genes. As expected, we found that many core clock components have similar functions across cell types. To our surprise, however, we also uncovered previously under-appreciated cell type-specific functions of core clock genes, particularly Per1, Per2, and Per3. Because the circadian system is integrated with, and influenced by, the local physiology that is under its control, our studies provide important implications for future studies into cell type-specific mechanisms of various circadian systems.
Whole genome sequencing of cancer genomes has revealed a diversity of recurrent gross chromosomal rearrangements (GCRs) that are likely signatures of specific defects in DNA damage response pathways. However, inferring the underlying defects has been difficult due to insufficient information relating defects in DNA metabolism to GCR signatures. By analyzing over 95 mutant strains of Saccharomyces cerevisiae, we found that the frequency of GCRs that deleted an internal CAN1/URA3 cassette on chrV L while retaining a chrV L telomeric hph marker was significantly higher in tel1Δ, sae2Δ, rad53Δ sml1Δ, and mrc1Δ tof1Δ mutants. The hph-retaining GCRs isolated from tel1Δ mutants contained either an interstitial deletion dependent on non-homologous end-joining or an inverted duplication that appeared to be initiated from a double strand break (DSB) on chrV L followed by hairpin formation, copying of chrV L from the DSB toward the centromere, and homologous recombination to capture the hph-containing end of chrV L. In contrast, hph-containing GCRs from other mutants were primarily interstitial deletions (mrc1Δ tof1Δ) or inverted duplications (sae2Δ and rad53Δ sml1Δ). Mutants with impaired de novo telomere addition had increased frequencies of hph-containing GCRs, whereas mutants with increased de novo telomere addition had decreased frequencies of hph-containing GCRs. Both types of hph-retaining GCRs occurred in wild-type strains, suggesting that the increased frequencies of hph retention were due to the relative efficiencies of competing DNA repair pathways. Interestingly, the inverted duplications observed here resemble common GCRs in metastatic pancreatic cancer.
Recent advances in the sequencing of human cancer genomes have revealed that some types of genome rearrangements are more common in specific types of cancers. Thus, these cancers may share defects in DNA repair mechanisms, which may play roles in initiation or progression of the disease and may be useful therapeutically. Linking a common rearrangement signature to a specific genetic or epigenetic alteration is currently challenging, because we do not know which rearrangement signatures are linked to which DNA repair defects. Here we used a genetic assay in the model organism Saccharomyces cerevisiae to specifically link two classes of chromosomal rearrangements, interstitial deletions and inverted duplications, to specific genetic defects. These results begin to map out the links between observed chromosomal rearrangements and specific DNA repair defects and in the present case, may provide insights into the chromosomal rearrangements frequently observed in metastatic pancreatic cancer.
The miR156-targeted SQUAMOSA PROMOTER BINDING PROTEIN LIKE (SPL) transcription factors function as an endogenous age cue in regulating plant phase transition and phase-dependent morphogenesis, but the control of SPL output remains poorly understood. In Arabidopsis thaliana the spatial pattern of trichomes is a hallmark of phase transition and is governed by SPLs. Here, by dissecting the regulatory network controlling trichome formation on the stem, we show that the miR171-targeted LOST MERISTEMS 1 (LOM1), LOM2 and LOM3, encoding GRAS family members previously known to maintain meristem cell polarity, are involved in regulating SPL activity. Reduced LOM abundance by overexpression of miR171 led to decreased trichome density on stems and floral organs, and conversely, constitutive expression of the miR171-resistant LOM (rLOM) genes promoted trichome production, indicating that LOMs enhance trichome initiation at the reproductive stage. Genetic analysis demonstrated that the shaping of trichome distribution by LOMs is dependent on SPLs, which positively regulate the trichome repressor genes TRICHOMELESS 1 (TCL1) and TRIPTYCHON (TRY). Physical interaction between the N-terminus of LOMs and SPLs underpins the repression of SPL activity. Importantly, other growth and developmental events, such as flowering, are also modulated by LOM-SPL interaction, indicating a broad effect of the LOM-SPL interplay. Furthermore, we provide evidence that MIR171 gene expression is regulated by its targeted LOMs, forming a homeostatic feedback loop. Our data uncover an antagonistic interplay between the two timing miRNAs in controlling plant growth, phase transition and morphogenesis through direct interaction of their targets.
MicroRNAs are important aging regulators in many organisms. In Arabidopsis the miR156-targeted SQUAMOSA PROMOTER BINDING PROTEIN LIKE (SPL) transcription factors play important roles as an endogenous age cue in programming phase transition and phase-dependent morphogenesis, including trichome patterning. However, how the SPL output, which increases over time, is modulated remains elusive. By dissecting the regulatory network controlling trichome formation on the stem, we show that a group of GRAS family members, LOST MERISTEMS 1 (LOM1), LOM2 and LOM3, targeted by the timing miRNA miR171, function in modulating SPL activity through direct protein-protein interaction. LOMs promote trichome formation by attenuating the trichome-repressing activity of SPLs (such as SPL9). The LOM-SPL interaction affects many aspects of plant growth and development, including flowering, aging and chlorophyll biosynthesis. Interestingly, MIR171A gene expression is regulated by its own targets (LOMs), forming a feedback loop to program plant life. Our study establishes an age-dependent regulatory network composed of two timing miRNAs which act oppositely through direct interaction of their target proteins.
The use of model organisms as tools for the investigation of human genetic variation has significantly and rapidly advanced our understanding of the aetiologies underlying hereditary traits. However, while equivalences in the DNA sequence of two species may be readily inferred through evolutionary models, the identification of equivalence in the phenotypic consequences resulting from comparable genetic variation is far from straightforward, limiting the value of the modelling paradigm. In this review, we provide an overview of the emerging statistical and computational approaches to objectively identify phenotypic equivalence between human and model organisms with examples from the vertebrate models, mouse and zebrafish. Firstly, we discuss enrichment approaches, which deem the most frequent phenotype among the orthologues of a set of genes associated with a common human phenotype as the orthologous phenotype, or phenolog, in the model species. Secondly, we introduce and discuss computational reasoning approaches to identify phenotypic equivalences made possible through the development of intra- and interspecies ontologies. Finally, we consider the particular challenges involved in modelling neuropsychiatric disorders, which illustrate many of the remaining difficulties in developing comprehensive and unequivocal interspecies phenotype mappings.
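The enrichment idea described in the review above can be illustrated with a minimal sketch. It assumes toy inputs: the gene identifiers, the phenotype labels, and the function names `candidate_phenolog` and `hypergeom_sf` are hypothetical and are not taken from any published phenolog tool. The sketch maps human genes to model-organism orthologues, picks the most frequent model phenotype among them, and scores it with a one-sided hypergeometric test, which is one common choice for this kind of enrichment rather than necessarily the test used in the approaches reviewed.

```python
from collections import Counter
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n genes are drawn from N, of which K carry the phenotype."""
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / comb(N, n)


def candidate_phenolog(human_genes, orthologs, model_annotations):
    """Return (phenotype, count, p_value) for the most frequent model phenotype.

    human_genes: genes associated with one human phenotype (hypothetical IDs).
    orthologs: dict mapping human gene -> model-organism gene.
    model_annotations: dict mapping model gene -> set of phenotype labels.
    """
    # Keep only orthologues that have phenotype annotations in the model species.
    mapped = {orthologs[g] for g in human_genes if orthologs.get(g) in model_annotations}
    counts = Counter(p for gene in mapped for p in model_annotations[gene])
    if not counts:
        return None
    # Most frequent phenotype; ties are broken alphabetically for determinism.
    phenotype, k = max(counts.items(), key=lambda kv: (kv[1], kv[0]))
    N = len(model_annotations)                                          # annotated model genes
    K = sum(1 for ps in model_annotations.values() if phenotype in ps)  # genes with the phenotype
    n = len(mapped)                                                     # mapped orthologues
    return phenotype, k, hypergeom_sf(k, N, K, n)


# Toy data, purely illustrative.
orthologs = {"HGENE1": "mg1", "HGENE2": "mg2", "HGENE3": "mg3"}
model_annotations = {
    "mg1": {"abnormal retina"},
    "mg2": {"abnormal retina", "small body"},
    "mg3": {"abnormal retina"},
    "mg4": {"small body"},
    "mg5": {"short tail"},
    "mg6": {"short tail"},
}
result = candidate_phenolog(["HGENE1", "HGENE2", "HGENE3"], orthologs, model_annotations)
print(result)
```

With these toy inputs, all three mapped orthologues share one phenotype, so that phenotype is returned as the candidate phenolog. A real analysis would also need multiple-testing correction across phenotypes, which this sketch omits.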
Intellectual disability and seizures are frequently associated with hypomagnesemia and have an important genetic component. However, finding the genetic origin of intellectual disability and seizures often remains challenging because of considerable genetic heterogeneity and clinical variability. In this study, we have identified new mutations in CNNM2 in five families suffering from mental retardation, seizures, and hypomagnesemia. For the first time, a recessive mode of inheritance of CNNM2 mutations was observed. Importantly, patients with recessive CNNM2 mutations suffer from brain malformations and severe intellectual disability. Additionally, three patients with moderate mental disability were shown to carry de novo heterozygous missense mutations in the CNNM2 gene. To elucidate the physiological role of CNNM2 and explain the pathomechanisms of disease, we studied CNNM2 function combining in vitro activity assays and the zebrafish knockdown model system. Using stable Mg2+ isotopes, we demonstrated that CNNM2 increases cellular Mg2+ uptake in HEK293 cells and that this process occurs through regulation of the Mg2+-permeable cation channel TRPM7. In contrast, cells expressing mutated CNNM2 proteins did not show increased Mg2+ uptake. Knockdown of cnnm2 isoforms in zebrafish resulted in disturbed brain development including neurodevelopmental impairments such as increased embryonic spontaneous contractions and weak touch-evoked escape behaviour, and reduced body Mg content, indicative of impaired renal Mg2+ absorption. These phenotypes were rescued by injection of mammalian wild-type Cnnm2 cRNA, whereas mammalian mutant Cnnm2 cRNA did not improve the zebrafish knockdown phenotypes. We therefore concluded that CNNM2 is fundamental for brain development, neurological functioning and Mg2+ homeostasis. By establishing the loss-of-function zebrafish model for CNNM2 genetic disease, we provide a unique system for testing therapeutic drugs targeting CNNM2 and for monitoring their effects on the brain and kidney phenotype.
Mental retardation affects 1–3% of the population and has a strong genetic etiology. Consequently, early identification of the genetic causes of mental retardation is of significant importance in the diagnosis of the disease, as a predictor of the progress of the disease and for the determination of treatment. In this study, we identify mutations in the gene encoding cyclin M2 (CNNM2) to be causative for mental retardation and seizures in patients with hypomagnesemia. Particularly, in patients with a recessive mode of inheritance, the intellectual disability caused by dysfunctional CNNM2 is dramatically severe and is accompanied by severely limited motor skills and brain malformations suggestive of impaired early brain development. Although hypomagnesemia has been associated with several neurological diseases, Mg2+ status is not regularly assessed in patients with seizures and mental disability. Our findings establish CNNM2 as an important protein for renal magnesium handling, brain development and neurological functioning, thus explaining the physiology of human disease caused by (dysfunctional) mutations in CNNM2. CNNM2 mutations should be taken into account in patients with seizures and mental disability, specifically in combination with hypomagnesemia.
The Myc family of transcription factors regulates a variety of biological processes, including the cell cycle, growth, proliferation, metabolism, and apoptosis. In Caenorhabditis elegans, the “Myc interaction network” consists of two opposing heterodimeric complexes with antagonistic functions in transcriptional control: the Myc-Mondo:Mlx transcriptional activation complex and the Mad:Max transcriptional repression complex. In C. elegans, Mondo, Mlx, Mad, and Max are encoded by mml-1, mxl-2, mdl-1, and mxl-1, respectively. Here we show a similar antagonistic role for the C. elegans Myc-Mondo and Mad complexes in longevity control. Loss of mml-1 or mxl-2 shortens C. elegans lifespan. In contrast, loss of mdl-1 or mxl-1 increases longevity, dependent upon MML-1:MXL-2. The MML-1:MXL-2 and MDL-1:MXL-1 complexes function in both the insulin signaling and dietary restriction pathways. Furthermore, decreased insulin-like/IGF-1 signaling (ILS) or conditions of dietary restriction increase the accumulation of MML-1, consistent with the notion that the Myc family members function as sensors of metabolic status. Additionally, we find that Myc family members are regulated by distinct mechanisms, which would allow for integrated control of gene expression from diverse signals of metabolic status. We compared putative target genes based on ChIP-sequencing data in the modENCODE project and found significant overlap in genomic DNA binding between the major effectors of ILS (DAF-16/FoxO), DR (PHA-4/FoxA), and Myc family (MDL-1/Mad/Mxd) at common target genes, which suggests that diverse signals of metabolic status converge on overlapping transcriptional programs that influence aging. Consistent with this, there is over-enrichment at these common targets for genes that function in lifespan, stress response, and carbohydrate metabolism. Additionally, we find that Myc family members are also involved in stress response and the maintenance of protein homeostasis. 
Collectively, these findings indicate that Myc family members integrate diverse signals of metabolic status, to coordinate overlapping metabolic and cytoprotective transcriptional programs that determine the progression of aging.
Transcription factors are essential proteins that regulate the expression of genes and play an important role in most biological processes. The results of our study presented here demonstrate for the first time a role in aging for a small family of transcription factors in the nematode worm Caenorhabditis elegans. Importantly, these proteins have close relatives in higher organisms, including humans, that influence metabolism and cell replication and have been implicated in the development of cancer. Moreover, the loss of one homologue has also been implicated in Williams-Beuren syndrome, a disease characterized in part by signs of premature aging. Our data demonstrate that these transcription factors function within insulin/IGF-1 signaling and dietary restriction, two highly conserved pathways that link nutrient sensing to longevity. Taken together, our findings provide exciting new insight into a family of proteins that may be essential for linking nutrient sensing to longevity and have implications for the improvement of human healthspan.
As in many species, gustatory pheromones regulate the mating behavior of Drosophila. Recently, several ppk genes, encoding ion channel subunits of the DEG/ENaC family, have been implicated in this process, leading to the identification of gustatory neurons that detect specific pheromones. In a subset of taste hairs on the legs of Drosophila, there are two ppk23-expressing, pheromone-sensing neurons with complementary response profiles; one neuron detects female pheromones that stimulate male courtship, the other detects male pheromones that inhibit male-male courtship. In contrast to ppk23, ppk25 is only expressed in a single gustatory neuron per taste hair, and males with impaired ppk25 function court females at reduced rates but do not display abnormal courtship of other males. These findings raised the possibility that ppk25 expression defines a subset of pheromone-sensing neurons. Here we show that ppk25 is expressed and functions in neurons that detect female-specific pheromones and mediates their stimulatory effect on male courtship. Furthermore, the role of ppk25 and ppk25-expressing neurons is not restricted to responses to female-specific pheromones. ppk25 is also required in the same subset of neurons for stimulation of male courtship by young males, males of the Tai2 strain, and by synthetic 7-pentacosene (7-P), a hydrocarbon normally found at low levels in both males and females. Finally, we unexpectedly find that, in females, ppk25 and ppk25-expressing cells regulate receptivity to mating. In the absence of the third antennal segment, which has both olfactory and auditory functions, mutations in ppk25 or silencing of ppk25-expressing neurons block female receptivity to males. Together these results indicate that ppk25 identifies a functionally specialized subset of pheromone-sensing neurons.
While ppk25 neurons are required for the responses to multiple pheromones, in both males and females these neurons are specifically involved in stimulating courtship and mating.
Drosophila mating behaviors serve as an attractive model to understand how external sensory cues are detected and used to generate appropriate behavioral responses. Pheromones present on the cuticle of Drosophila have important roles in stimulating male courtship toward females and inhibiting male courtship directed at other males. Recently, stimulatory pheromones emitted by females and inhibitory pheromones emitted by males have been shown to stimulate distinct subsets of gustatory neurons on the legs. We have previously shown that a DEG/ENaC ion channel subunit, ppk25, is involved in male courtship toward females but not in inhibition of male-male courtship. Here we show that ppk25 is specifically expressed and functions in a subset of gustatory neurons that mediate physiological and behavioral responses to female-specific stimulatory pheromones. Furthermore, ppk25 is also required for the function of those neurons to activate male courtship in response to other pheromones that are not female-specific. In addition to their roles in males, we find that ppk25, and the related DEG/ENaC subunits ppk23 and ppk29, also stimulate female mating behavior. In conclusion, these results show that, in both sexes, ppk25 functions in a group of neurons with a specialized role in stimulating mating behaviors.
Insulin-like peptides (ILPs) play highly conserved roles in development and physiology. Most animal genomes encode multiple ILPs. Here we identify mechanisms for how the forty Caenorhabditis elegans ILPs coordinate diverse processes, including development, reproduction, longevity and several specific stress responses. Our systematic studies identify an ILP-based combinatorial code for these phenotypes characterized by substantial functional specificity and diversity rather than global redundancy. Notably, we show that ILPs regulate each other transcriptionally, uncovering an ILP-to-ILP regulatory network that underlies the combinatorial phenotypic coding by the ILP family. Extensive analyses of genetic interactions among ILPs reveal how their signals are integrated. A combined analysis of these functional and regulatory ILP interactions identifies local genetic circuits that act in parallel and interact by crosstalk, feedback and compensation. This organization provides emergent mechanisms for phenotypic specificity and graded regulation for the combinatorial phenotypic coding we observe. Our findings also provide insights into how large hormonal networks regulate diverse traits.
Insulin signaling is widely implicated in regulating diverse physiological processes ranging from metabolism to longevity across many animal species. Many animals have multiple insulin-like peptides that can regulate the activity of this signaling pathway. For example, while humans have ten, including the well-studied insulin hormone, the nematode Caenorhabditis elegans has forty such peptides. The similarity among these insulin-like peptides led to the predominant notion that widespread redundancy occurs among these peptides. Contrary to this notion, we find that the forty insulin-like peptides in the nematode C. elegans have specific and distinct effects on eight different physiological outputs that include development, stress responses, lifespan and reproduction. Interestingly, we also find that these peptides regulate each other at the transcriptional level to form a signaling network. In addition, we observe that this network is organized into parallel circuits, whose activities are affected by compensation, feedback and crosstalk. Finally, the organization of the network helps to explain how different combinations of peptides generate specific outputs and captures the complexity of how these peptides orchestrate an animal's physiology through distinct peptide-to-peptide signaling circuits.
We report a phenomenon wherein induction of cell death by a variety of means in wing imaginal discs of Drosophila larvae resulted in the activation of an anti-apoptotic microRNA, bantam (ban). Cells in the vicinity of dying cells also become harder to kill by ionizing radiation (IR)-induced apoptosis. Both ban activation and increased protection from IR required the receptor tyrosine kinase Tie, which we identified in a genetic screen for modifiers of ban. tie mutants were hypersensitive to radiation, and radiation sensitivity of tie mutants was rescued by increased ban gene dosage. We propose that dying cells activate ban in surviving cells through Tie to make the latter cells harder to kill, thereby preserving tissues and ensuring organism survival. The protective effect we report differs from the classical radiation bystander effect, in which neighbors of irradiated cells become more prone to death. The protective effect also differs from the previously described effect of dying cells that results in proliferation of nearby cells in Drosophila larval discs. If conserved in mammals, a phenomenon in which dying cells make the rest harder to kill by IR could have implications for treatments that involve the sequential use of cytotoxic agents and radiation therapy.
In multicellular organisms where cells exist in the context of other cells, the behavior of one affects the others. The consequences of such interactions include not just cell fate choices but also life and death decisions. In the wing primordia of Drosophila melanogaster larvae, dying cells release mitogenic signals that stimulate the neighbors to proliferate. Such an effect is proposed to compensate for cell loss and help regenerate the tissue. We report here that, in the same experimental system, dying cells activate a pro-survival microRNA, bantam, in surviving cells. This results in increased protection from the killing effect of ionizing radiation (IR). Activation of ban requires tie, which encodes a receptor tyrosine kinase. tie and ban mutant larvae are hypersensitive to killing by IR, suggesting that the responses described here are important for organismal survival following radiation exposure.
Ovarian cancer is the fifth leading cause of cancer death in women. Almost 70% of ovarian cancer deaths are due to the high-grade serous subtype, which is typically detected only after it has metastasized. Characterization of high-grade serous cancer is further complicated by the significant heterogeneity and genome instability displayed by this cancer. Other than mutations in TP53, which are common to many cancers, highly recurrent recombinant events specific to this cancer have yet to be identified. Using high-throughput transcriptome sequencing of seven patient samples combined with experimental validation at DNA, RNA and protein levels, we identified a cancer-specific and inter-chromosomal fusion gene CDKN2D-WDFY2 that occurs at a frequency of 20% among sixty high-grade serous cancer samples but is absent in non-cancerous ovary and fallopian tube samples. This is the most frequent recombinant event identified so far in high-grade serous cancer, implying a major cellular lineage in this highly heterogeneous cancer. In addition, the same fusion transcript was also detected in OV-90, an established high-grade serous type cell line. The genomic breakpoint was identified in intron 1 of CDKN2D and intron 2 of WDFY2 in patient tumor, providing direct evidence that this is a fusion gene. The parental gene, CDKN2D, is a cell-cycle modulator that is also involved in DNA repair, while WDFY2 is known to modulate AKT interactions with its substrates. Transfection of the cloned fusion construct led to loss of wildtype CDKN2D and wildtype WDFY2 protein expression, and a gain of a short WDFY2 protein isoform that is presumably under the control of the CDKN2D promoter. The expression of short WDFY2 protein in transfected cells appears to alter the PI3K/AKT pathway that is known to play a role in oncogenesis. CDKN2D-WDFY2 fusion could be an important molecular signature for understanding and classifying sub-lineages among heterogeneous high-grade serous ovarian carcinomas.
High-grade serous carcinoma (HG-SC) is the most common subtype of ovarian cancer observed in women. This subtype of ovarian cancer is typically detected at advanced stages due to lack of effective early screening tools. Recurrent cancer-specific gene fusions resulting from chromosomal translocations have the potential to serve as effective screening tools as well as therapeutic targets. Here we identified CDKN2D-WDFY2 as a cancer-specific fusion gene present in 20% of HG-SC tumors, by far the most frequent gene recombinant event found in this highly heterogeneous disease. We also presented evidence that the expression of this fusion may affect the PI3K/AKT pathway that is important for cancer progression. Thus CDKN2D-WDFY2 could very well represent a major cellular lineage important for detecting and classifying heterogeneous ovarian carcinomas, and could provide insight into the underlying mechanism of this deadly disease. This is critical, given that ovarian cancer kills 140,200 women worldwide each year, and few ovarian cancer-specific molecular alterations are currently available for targeting and screening.
Pervasive natural selection can strongly influence observed patterns of genetic variation, but these effects remain poorly understood when multiple selected variants segregate in nearby regions of the genome. Classical population genetics fails to account for interference between linked mutations, which grows increasingly severe as the density of selected polymorphisms increases. Here, we describe a simple limit that emerges when interference is common, in which the fitness effects of individual mutations play a relatively minor role. Instead, similar to models of quantitative genetics, molecular evolution is determined by the variance in fitness within the population, defined over an effectively asexual segment of the genome (a “linkage block”). We exploit this insensitivity in a new “coarse-grained” coalescent framework, which approximates the effects of many weakly selected mutations with a smaller number of strongly selected mutations that create the same variance in fitness. This approximation generates accurate and efficient predictions for silent site variability when interference is common. However, these results suggest that there is reduced power to resolve individual selection pressures when interference is sufficiently widespread, since a broad range of parameters possess nearly identical patterns of silent site variability.
A central goal of evolutionary genetics is to understand how natural selection influences DNA sequence variability. Yet while empirical studies have uncovered significant evidence for selection in many natural populations, a rigorous characterization of these selection pressures has so far been difficult to achieve. The problem is that when selection acts on linked loci, it introduces correlations along the genome that are difficult to disentangle. These “interference” effects have been extensively studied in simulation, but theory still struggles to account for interference in predicted patterns of sequence variability, which limits the quantitative conclusions that can be drawn from modern sequence data. Here, we show that in spite of this complexity, simple behavior emerges in the limit that interference is common. Patterns of molecular evolution depend on the variance in fitness within the population, and are only weakly influenced by the fitness effects of individual mutations. We leverage this “emergent simplicity” to establish a new framework for predicting genetic diversity in these populations. Our results have important practical implications for the interpretation of natural sequence variability, particularly in regions of low recombination, and suggest an inherent “resolution limit” for the quantitative inference of selection pressures from sequence polymorphism data.
Polymorphisms that affect complex traits or quantitative trait loci (QTL) often affect multiple traits. We describe two novel methods (1) for finding single nucleotide polymorphisms (SNPs) significantly associated with one or more traits using a multi-trait meta-analysis, and (2) for distinguishing between a single pleiotropic QTL and multiple linked QTL. The meta-analysis uses the effect of each SNP on each of n traits, estimated in single trait genome wide association studies (GWAS). These effects are expressed as a vector of signed t-values (t) and the error covariance matrix of these t-values is approximated by the correlation matrix of t-values among the traits calculated across the SNPs (V). Consequently, t′V⁻¹t is approximately distributed as a chi-squared with n degrees of freedom. An attractive feature of the meta-analysis is that it uses estimated effects of SNPs from single trait GWAS, so it can be applied to published data where individual records are not available. We demonstrate that the multi-trait method can be used to increase the power (numbers of SNPs validated in an independent population) of GWAS in a beef cattle data set including 10,191 animals genotyped for 729,068 SNPs with 32 traits recorded, including growth and reproduction traits. We can distinguish between a single pleiotropic QTL and multiple linked QTL because multiple SNPs tagging the same QTL show the same pattern of effects across traits. We confirm this finding by demonstrating that when one SNP is included in the statistical model the other SNPs have a non-significant effect. In the beef cattle data set, cluster analysis yielded four groups of QTL with similar patterns of effects across traits within a group. A linear index was used to validate SNPs having effects on multiple traits and to identify additional SNPs belonging to these four groups.
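The multi-trait statistic described in this abstract can be illustrated with a minimal sketch. It assumes the signed t-values are already available as one row per SNP and one column per trait. The function names (correlation_matrix, solve, multi_trait_statistic) and the example t-values are hypothetical and are not taken from the study. V is estimated as the correlation matrix of t-values across SNPs, and the quadratic form t′V⁻¹t is computed for a single SNP by solving a linear system rather than inverting V explicitly.

```python
import math


def correlation_matrix(t_rows):
    """Correlation matrix of t-values among traits, calculated across SNPs.

    t_rows: list of per-SNP lists, each holding one signed t-value per trait.
    """
    m = len(t_rows)        # number of SNPs
    n = len(t_rows[0])     # number of traits
    means = [sum(row[j] for row in t_rows) / m for j in range(n)]
    sds = [
        math.sqrt(sum((row[j] - means[j]) ** 2 for row in t_rows) / (m - 1))
        for j in range(n)
    ]
    return [
        [
            sum((row[i] - means[i]) * (row[j] - means[j]) for row in t_rows)
            / ((m - 1) * sds[i] * sds[j])
            for j in range(n)
        ]
        for i in range(n)
    ]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def multi_trait_statistic(t, V):
    """Quadratic form t' V^-1 t for one SNP's vector of signed t-values."""
    x = solve(V, t)  # x = V^-1 t
    return sum(ti * xi for ti, xi in zip(t, x))


if __name__ == "__main__":
    # Hypothetical t-values: 5 SNPs (rows) by 3 traits (columns).
    t_values = [
        [2.1, 1.8, -0.3],
        [-0.5, 0.2, 1.1],
        [1.2, 0.9, 0.4],
        [-1.7, -1.1, -0.8],
        [0.3, -0.6, 2.0],
    ]
    V = correlation_matrix(t_values)
    for snp_index, t in enumerate(t_values):
        print(snp_index, multi_trait_statistic(t, V))
```

Under the approximation stated in the abstract, each resulting value would be compared with a chi-squared distribution with n degrees of freedom, where n is the number of traits (3 in this toy example). In practice V would be estimated from a genome-wide panel rather than from five SNPs.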
We describe novel methods for finding significant associations between a genome wide panel of SNPs and multiple complex traits, and further for distinguishing between genes with effects on multiple traits and multiple linked genes affecting different traits. The method uses a meta-analysis based on estimates of SNP effects from independent single trait genome wide association studies (GWAS). The method could therefore be widely used to combine already published GWAS results. The method was applied to 32 traits that describe growth, body composition, feed intake and reproduction in 10,191 beef cattle genotyped for approximately 700,000 SNP. The genes found to be associated with these traits can be arranged into 4 groups that differ in their pattern of effects and hence presumably in their physiological mechanism of action. For instance, one group of genes affects weight and fatness in the opposite direction and can be described as a group of genes affecting mature size, while another group affects weight and fatness in the same direction.
Clonally derived bacterial populations exhibit significant genotypic and phenotypic diversity that contributes to fitness in rapidly changing environments. Here, we show that serial passage of Salmonella enterica serovar Typhimurium LT2 (StLT2) in broth, or within a mouse host, results in selection of an evolved population that inhibits the growth of ancestral cells by direct contact. Cells within each evolved population gain the ability to express and deploy a cryptic “orphan” toxin encoded within the rearrangement hotspot (rhs) locus. The Rhs orphan toxin is encoded by a gene fragment located downstream of the “main” rhs gene in the ancestral strain StLT2. The Rhs orphan coding sequence is linked to an immunity gene, which encodes an immunity protein that specifically blocks Rhs orphan toxin activity. Expression of the Rhs orphan immunity protein protects ancestral cells from the evolved lineages, indicating that orphan toxin activity is responsible for the observed growth inhibition. Because the Rhs orphan toxin is encoded by a fragmented reading frame, it lacks translation initiation and protein export signals. We provide evidence that evolved cells undergo recombination between the main rhs gene and the rhs orphan toxin gene fragment, yielding a fusion that enables expression and delivery of the orphan toxin. In this manner, rhs locus rearrangement provides a selective advantage to a subpopulation of cells. These observations suggest that rhs genes play important roles in intra-species competition and bacterial evolution.
Salmonella Typhimurium is a bacterium that causes intestinal diseases in a number of animals including humans. In mice, this pathogen invades tissues, causing symptoms similar to typhoid fever. In an effort to understand the evolution of this pathogen, we grew S. Typhimurium in either liquid broth or in mice for many generations and examined the resulting “evolved” strains to determine if they were different from the original “parent” culture. We found that many of these evolved strains inhibited the growth of the parent after they were mixed together, and that this growth inhibition requires that the evolved and parental cells are in close contact. Genetic analysis showed that this contact-dependent growth inhibition requires Rhs protein, which has a toxic tip. Salmonella is normally resistant to its Rhs toxin because it also produces an immunity protein that blocks toxin activity. However, evolved cells have undergone a DNA rearrangement that allows them to express a different Rhs toxic tip that inhibits growth of the parental cells, which lack immunity to it. This allows the evolved cells to outgrow the original parental cells. Our work indicates that populations of Salmonella are dynamic, with individuals battling with each other for dominance.
The antigenic repertoire presented by MHC molecules is generated by the antigen processing and presentation (APP) pathway. We analyzed the evolutionary history of 45 genes involved in APP at the inter- and intra-species level. Results showed that 11 genes evolved adaptively in mammals. Several positively selected sites involve positions of fundamental importance to the protein function (e.g. the TAP1 peptide-binding domains, the sugar binding interface of langerin, and the CD1D trafficking signal region). In CYBB, all selected sites cluster in two loops protruding into the endosomal lumen; analysis of missense mutations responsible for chronic granulomatous disease (CGD) showed the action of different selective forces on the very same gene region, as most CGD substitutions involve amino acid positions that are conserved in all mammals. As for ERAP2, different computational methods indicated that positive selection has driven the recurrent appearance of protein-destabilizing variants during mammalian evolution. Application of a population-genetics phylogenetics approach showed that purifying selection represented a major force acting on some APP components (e.g. immunoproteasome subunits and chaperones) and allowed identification of positive selection events in the human lineage.
We also investigated the evolutionary history of APP genes in human populations by developing a new approach that uses several different tests to identify the selection target, and that integrates low-coverage whole-genome sequencing data with Sanger sequencing. This analysis revealed that 9 APP genes underwent local adaptation in human populations. Most positive selection targets are located within noncoding regions with regulatory function in myeloid cells or act as expression quantitative trait loci. Conversely, balancing selection targeted nonsynonymous variants in TAP1 and CD207 (langerin). Finally, we suggest that selected variants in PSMB10 and CD207 contribute to human phenotypes. Thus, we used evolutionary information to generate experimentally-testable hypotheses and to provide a list of sites to prioritize in follow-up analyses.
Antigen-presenting cells digest intracellular and extracellular proteins and display the resulting antigenic repertoire on cell surface molecules for recognition by T cells. This process initiates cell-mediated immune responses and is essential to detect infections. The antigenic repertoire is generated by the antigen processing and presentation pathway. Because several pathogens evade immune recognition by hampering this process, genes involved in antigen processing and presentation may represent common natural selection targets. Thus, we analyzed the evolutionary history of these genes during mammalian evolution and in the more recent history of human populations. Evolutionary analyses in mammals indicated that positive selection targeted a very high proportion of genes (24%), and revealed that many selected sites affect positions of fundamental importance to the protein function. In humans, we found different signatures of natural selection acting both on regions that are expected to regulate gene expression levels or timing and on coding variants; two human selected polymorphisms may modulate the susceptibility to Crohn's disease and to HIV-1 infection. Therefore, we provide a comprehensive evolutionary analysis of antigen processing and we show that evolutionary studies can provide useful information concerning the location and nature of functional variants, ultimately helping to clarify phenotypic differences between and within species.
TBX3 is a member of the T-box family of transcription factors with critical roles in development, oncogenesis, cell fate, and tissue homeostasis. TBX3 mutations in humans cause complex congenital malformations and Ulnar-mammary syndrome (UMS). Previous investigations into TBX3 function focused on its activity as a transcriptional repressor. We used an unbiased proteomic approach to identify TBX3 interacting proteins in vivo and discovered that TBX3 interacts with multiple mRNA splicing factors and RNA metabolic proteins. We discovered that TBX3 regulates alternative splicing in vivo and can promote or inhibit splicing depending on context and transcript. TBX3 associates with alternatively spliced mRNAs and binds RNA directly. TBX3 binds RNAs containing TBX binding motifs, and these motifs are required for regulation of splicing. Our study reveals that TBX3 mutations seen in humans with UMS disrupt its splicing regulatory function. The pleiotropic effects of TBX3 mutations in humans and mice likely result from disrupting at least two molecular functions of this protein: transcriptional regulation and pre-mRNA splicing.
TBX3 is a protein with essential roles in development and tissue homeostasis, and is implicated in cancer pathogenesis. TBX3 mutations in humans cause a complex of birth defects called Ulnar-mammary syndrome (UMS). Despite the importance of TBX3 and decades of investigation, few TBX3 partner proteins have been identified and little is known about how it functions in cells. Unlike previous investigations focused on TBX3 as a DNA-binding factor that represses transcription, we took an unbiased approach to identify TBX3 partner proteins in mouse embryos and human cells. We discovered that TBX3 interacts with RNA binding proteins and binds mRNAs to regulate how they are spliced. The different mutations seen in human UMS patients produce mutant proteins that interact with different partners and have different splicing activities. TBX3 promotes or inhibits splicing depending on cellular context, its partner proteins, and the target mRNA. Eukaryotic cells have many more proteins than genes: alternative splicing is critical to generate the different mRNAs needed for production of the specific and vast repertoire of proteins a cell produces. Our finding that TBX3 regulates this process provides fundamental new insights into how altered quantity and molecular function of TBX3 contribute to human developmental disorders and cancer.