Summary: We have developed Cake, a bioinformatics software pipeline that integrates four publicly available somatic variant-calling algorithms to identify single nucleotide variants with higher sensitivity and accuracy than any one algorithm alone. Cake can be run on a high-performance computer cluster or used as a stand-alone application.
Availability: Cake is open-source and is available from http://cakesomatic.sourceforge.net/
Supplementary data are available at Bioinformatics online.
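The summary above does not say how Cake merges the output of its four callers. As a rough illustration of why combining callers can help, the sketch below keeps only variants reported by at least a chosen number of callers. The function name, the caller names, the tuple format and the voting rule are all assumptions made for this example; they are not taken from Cake itself.

```python
from collections import Counter


def consensus_calls(calls_by_caller, min_callers=2):
    """Return variants reported by at least `min_callers` callers.

    `calls_by_caller` maps a caller name to an iterable of
    (chrom, pos, ref, alt) tuples. This input format is hypothetical.
    """
    counts = Counter()
    for calls in calls_by_caller.values():
        # set() ensures a caller that lists a site twice is counted once
        counts.update(set(calls))
    return sorted(v for v, n in counts.items() if n >= min_callers)


# Toy calls from four hypothetical callers
example = {
    "caller_a": [("chr1", 100, "A", "T"), ("chr2", 50, "G", "C")],
    "caller_b": [("chr1", 100, "A", "T")],
    "caller_c": [("chr1", 100, "A", "T"), ("chr3", 7, "C", "G")],
    "caller_d": [("chr2", 50, "G", "C")],
}

print(consensus_calls(example, min_callers=2))
```

Raising `min_callers` trades sensitivity for specificity: a threshold of 1 is the union of all callers, and a threshold equal to the number of callers is their intersection.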
Antiviral responses must be tightly regulated to rapidly defend against infection while minimizing inflammatory damage. Type 1 interferons (IFN-I) are crucial mediators of antiviral responses1 and their transcription is regulated by a variety of transcription factors2; principal amongst these is the family of interferon regulatory factors (IRFs)3. The IRF gene regulatory networks are complex and contain multiple feedback loops. The tools of systems biology are well suited to elucidate the complex interactions that give rise to precise coordination of the interferon response. Here we have used an unbiased systems approach to predict that a member of the forkhead family of transcription factors, FOXO3, is a negative regulator of a subset of antiviral genes. This prediction was validated using macrophages isolated from Foxo3-null mice. Genome-wide location analysis combined with gene deletion studies identified the Irf7 gene as a critical target of FOXO3. FOXO3 was identified as a negative regulator of Irf7 transcription and we have further demonstrated that FOXO3, IRF7 and IFN-I form a coherent feed-forward regulatory circuit. Our data suggest that the FOXO3-IRF7 regulatory circuit represents a novel mechanism for establishing the requisite set points in the interferon pathway that balances the beneficial effects and deleterious sequelae of the antiviral response.
The t(12;21) translocation, which generates the ETV6-RUNX1 (TEL-AML1) fusion gene, is the most common chromosomal rearrangement in childhood cancer and is exclusively associated with B-cell precursor acute lymphoblastic leukemia (BCP-ALL). The translocation arises in utero and is necessary but insufficient for the development of leukemia. SNP array analysis of ETV6-RUNX1 patient samples has identified multiple additional genetic alterations; however, the role of these lesions in leukemogenesis remains undetermined. Moreover, murine models of ETV6-RUNX1 ALL that faithfully recapitulate the human disease are lacking. To identify novel genes that co-operate with ETV6-RUNX1 in leukemogenesis, we generated a mouse model that uses the endogenous Etv6 locus to co-express the ETV6-RUNX1 fusion and Sleeping Beauty (SB) transposase. An insertional mutagenesis screen was performed by intercrossing these mice with those carrying an SB transposon array. In contrast to previous models, a substantial proportion (20%) of the offspring developed BCP-ALL. Isolation of the transposon insertion sites identified genes known to be associated with BCP-ALL, including Ebf1 and Epor, in addition to other novel candidates. This is the first mouse model of ETV6-RUNX1 to develop BCP-ALL and provides important insights into the cooperating genetic alterations in ETV6-RUNX1 leukemia.
ETV6-RUNX1; leukemia; precursor-B cell; insertional mutagenesis
Pancreatic ductal adenocarcinoma (PDA) remains a lethal malignancy despite tremendous progress in its molecular characterization. Indeed, PDA tumors harbor four signature somatic mutations1–4, and a plethora of lower frequency genetic events of uncertain significance5. Here, we used Sleeping Beauty (SB) transposon-mediated insertional mutagenesis6,7 in a mouse model of pancreatic ductal preneoplasia8 to identify genes that cooperate with oncogenic KrasG12D to accelerate tumorigenesis and promote progression. Our screen revealed new candidates and confirmed the importance of many genes and pathways previously implicated in human PDA. Interestingly, the most commonly mutated gene was the X-linked deubiquitinase Usp9x, which was inactivated in over 50% of the tumors. Although prior work had attributed a pro-survival role to USP9X in human neoplasia9, we found instead that loss of Usp9x enhances transformation and protects pancreatic cancer cells from anoikis. Clinically, low USP9X protein and mRNA expression in PDA correlates with poor survival following surgery, and USP9X levels are inversely associated with metastatic burden in advanced disease. Furthermore, chromatin modulation with trichostatin A or 5-aza-2′-deoxycytidine elevates USP9X expression in human PDA cell lines to suggest a clinical approach for certain patients. The conditional deletion of Usp9x cooperated with KrasG12D to rapidly accelerate pancreatic tumorigenesis in mice, validating their genetic interaction. Therefore, we propose USP9X as a major new tumor suppressor gene with prognostic and therapeutic relevance in PDA.
We recently proposed that competitive endogenous RNAs (ceRNAs) sequester microRNAs to regulate mRNA transcripts containing common microRNA recognition elements (MREs). However, the functional role of ceRNAs in cancer remains unknown. Loss of PTEN, a tumor suppressor regulated by ceRNA activity, frequently occurs in melanoma. Here, we report the discovery of significant enrichment of putative PTEN ceRNAs among genes whose loss accelerates tumorigenesis following Sleeping Beauty insertional mutagenesis in a mouse model of melanoma. We validated several putative PTEN ceRNAs and further characterized one, the ZEB2 transcript. We show that ZEB2 modulates PTEN protein levels in a microRNA-dependent, protein coding-independent manner. Attenuation of ZEB2 expression activates the PI3K/AKT pathway, enhances cell transformation, and commonly occurs in human melanomas and other cancers expressing low PTEN levels. Our study genetically identifies multiple putative microRNA decoys for PTEN, validates ZEB2 mRNA as a bona fide PTEN ceRNA, and demonstrates that abrogated ZEB2 expression cooperates with BRAFV600E to promote melanomagenesis.
Haploinsufficiency of the human 5q35 region spanning the NSD1 gene results in a rare genomic disorder known as Sotos syndrome (Sotos), with patients displaying a variety of clinical features, including pre- and postnatal overgrowth, intellectual disability, and urinary/renal abnormalities. We used chromosome engineering to generate a segmental monosomy, i.e., mice carrying a heterozygous 1.5-Mb deletion of 36 genes on mouse chromosome 13 (4732471D19Rik-B4galt7), syntenic with 5q35.2–q35.3 in humans (Df(13)Ms2Dja+/− mice). Surprisingly, Df(13)Ms2Dja+/− mice were significantly smaller for their gestational age and also showed decreased postnatal growth, in contrast to Sotos patients. Df(13)Ms2Dja+/− mice did, however, display deficits in long-term memory retention and dilation of the pelvicalyceal system, which in part may model the learning difficulties and renal abnormalities observed in Sotos patients. Thus, haploinsufficiency of genes within the mouse 4732471D19Rik–B4galt7 deletion interval plays important roles in growth, memory retention, and the development of the renal pelvicalyceal system.
Electronic supplementary material
The online version of this article (doi:10.1007/s00335-012-9416-0) contains supplementary material, which is available to authorized users.
The evolution of colorectal cancer suggests the involvement of many genes. We performed insertional mutagenesis with the Sleeping Beauty (SB) transposon system in mice carrying germline or somatic Apc mutation. Analysis of common insertion sites (CISs) isolated from 446 tumors revealed many hundreds of candidate cancer drivers. Comparison to human datasets suggested that 234 CIS genes are also deregulated in human colorectal cancers. 183 CIS genes are candidate Wnt targets, and 20 are shown to be novel modifiers of canonical Wnt signaling. We also identified gene mutations associated with a subset of tumors containing an expanded number of Paneth cells, a hallmark of deregulated Wnt signaling, and genes associated with more severe dysplasia included members of the FGF signaling cascade. Some 70 genes showed pairwise co-occurrence clustering into 38 sub-networks that may regulate tumor development.
CADM1 encodes an immunoglobulin superfamily (IGSF) cell adhesion molecule. Inactivation of CADM1, either by promoter hypermethylation or loss of heterozygosity, has been reported in a wide variety of tumor types, and it has therefore been postulated to be a tumor suppressor gene.
We show for the first time that Cadm1 homozygous null mice die significantly faster than wildtype controls due to the spontaneous development of tumors at an earlier age and an increased tumor incidence of predominantly lymphomas, but also some solid tumors. Tumorigenesis was accelerated after irradiation of Cadm1 mice, with the reduced latency in tumor formation suggesting there are genes that collaborate with loss of Cadm1 in tumorigenesis. To identify these co-operating genetic events, we performed a Sleeping Beauty transposon-mediated insertional mutagenesis screen in Cadm1 mice, and identified several common insertion sites (CIS) found specifically on a Cadm1-null background (and not wildtype background).
We confirm that Cadm1 is indeed a bona fide tumor suppressor gene and provide new insights into genetic partners that co-operate in tumorigenesis when Cadm1-expression is lost.
Cell adhesion molecule; Tumor suppressor; Transposon; Glucocorticoid; Cell junction
Nuclear receptor binding protein 1 regulates intestinal progenitor cell homeostasis and tumour formation
Arising from a ras-interaction screen in C. elegans, nuclear receptor binding protein 1 (NRBP1) is shown to impose a crypt progenitor phenotype in mice and is proposed as a novel tumour suppressor in human cancer.
Genetic screens in simple model organisms have identified many of the key components of the conserved signal transduction pathways that are oncogenic when misregulated. Here, we identify H37N21.1 as a gene that regulates vulval induction in let-60(n1046gf), a strain with a gain-of-function mutation in the Caenorhabditis elegans Ras orthologue, and show that somatic deletion of Nrbp1, the mouse orthologue of this gene, results in an intestinal progenitor cell phenotype that leads to profound changes in the proliferation and differentiation of all intestinal cell lineages. We show that Nrbp1 interacts with key components of the ubiquitination machinery and that loss of Nrbp1 in the intestine results in the accumulation of Sall4, a key mediator of stem cell fate, and of Tsc22d2. We also reveal that somatic loss of Nrbp1 results in tumourigenesis, with haematological and intestinal tumours predominating, and that nuclear receptor binding protein 1 (NRBP1) is downregulated in a range of human tumours, where low expression correlates with a poor prognosis. Thus, NRBP1 is a conserved regulator of cell fate that plays an important role in tumour suppression.
intestine; progenitor cell; Ras; tumour suppressor gene; WNT
While genomic alterations identified in human tumors using techniques such as comparative genomic hybridisation (CGH) may be recurrent, they frequently encompass large regions, in some cases containing hundreds of genes. Here we combine high-resolution CGH analysis of 598 human cancer cell lines with insertion sites isolated from 1,005 mouse tumors induced with the Murine Leukaemia Virus (MuLV). This cross-species oncogenomic analysis revealed candidate tumor suppressor genes and oncogenes recurrently mutated in both human and mouse tumors, making them strong candidate cancer genes. A significant number of these genes contained binding sites for the transcription factors Oct4 and Nanog, and mice carrying tumors with insertions in or near stem cell module genes (genes that are thought to participate in self-renewal) died significantly faster than mice without these insertions. The profile of MuLV insertions that we identified was compared to insertions isolated from 73 tumors induced using the Sleeping Beauty (SB) transposon system, revealing significant differences in the profile of recurrently mutated genes. Collectively, this work provides a rich catalogue of candidate genes for follow-up functional analysis.
Cross-species analysis; insertional mutagenesis; bioinformatics; oncogenomics; comparative genomic hybridization
The innate immune system is a two-edged sword; it is absolutely required for host defense against infection but, uncontrolled, can trigger a plethora of inflammatory diseases. Here we used systems biology approaches to predict and validate a gene regulatory network involving a dynamic interplay between the transcription factors NF-κB, C/EBPδ, and ATF3 that controls inflammatory responses. We mathematically modeled transcriptional regulation of Il6 and Cebpd genes and experimentally validated the prediction that the combination of an initiator (NF-κB), an amplifier (C/EBPδ) and an attenuator (ATF3) forms a regulatory circuit that discriminates between transient and persistent Toll-like receptor 4-induced signals. Our results suggest a mechanism that enables the innate immune system to detect the duration of infection and to respond appropriately.
Methods for accurate identification of nucleotide and structural variation using de novo short read sequencing of mouse chromosomes are described.
Genome sequences are essential tools for comparative and mutational analyses. Here we present the short read sequence of mouse chromosome 17 from the Mus musculus domesticus derived strain A/J, and the Mus musculus castaneus derived strain CAST/Ei. We describe approaches for the accurate identification of nucleotide and structural variation in the genomes of vertebrate experimental organisms, and show how these techniques can be applied to help prioritize candidate genes within quantitative trait loci.
An important problem in molecular biology is to build a complete understanding of transcriptional regulatory processes in the cell. We have developed a flexible, probabilistic framework to predict TF binding from multiple data sources that differs from the standard hypothesis testing (scanning) methods in several ways. Our probabilistic modeling framework estimates the probability of binding and, thus, naturally reflects our degree of belief in binding. Probabilistic modeling also allows for easy and systematic integration of our binding predictions into other probabilistic modeling methods, such as expression-based gene network inference. The method answers the question of whether the whole analyzed promoter has a binding site, but can also be extended to estimate the binding probability at each nucleotide position. Further, we introduce an extension to model combinatorial regulation by several TFs. Most importantly, the proposed methods can make principled probabilistic inference from multiple evidence sources, such as multiple statistical models (motifs) of the TFs, evolutionary conservation, regulatory potential, CpG islands, nucleosome positioning, DNase hypersensitive sites, ChIP-chip binding segments and other (prior) sequence-based biological knowledge. We developed both a likelihood and a Bayesian method, where the latter is implemented with a Markov chain Monte Carlo algorithm. Results on a carefully constructed test set from the mouse genome demonstrate that principled data fusion can significantly improve the performance of TF binding prediction methods. We also applied the probabilistic modeling framework to all promoters in the mouse genome, and the results indicate a sparse connectivity between transcriptional regulators and their target promoters. To facilitate analysis of other sequences and additional data, we have developed an on-line web tool, ProbTF, which implements our probabilistic TF binding prediction method using multiple data sources.
A test data set, a web tool, source code and supplementary data are available at http://www.probtf.org.
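To make the idea of fusing several evidence sources into one binding probability concrete, the sketch below updates a prior probability with one likelihood ratio per evidence source. It is a textbook naive-Bayes style combination that assumes the sources are conditionally independent. It is not the ProbTF model, and the function name and the example numbers are hypothetical.

```python
import math


def posterior_binding(prior, likelihood_ratios):
    """Combine a prior binding probability with evidence likelihood ratios.

    Each ratio is P(evidence | bound) / P(evidence | not bound) for one
    source (e.g. a motif score or a conservation score). The sources are
    assumed to be conditionally independent, which is a simplification.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    log_odds = math.log(prior / (1.0 - prior))
    for lr in likelihood_ratios:
        if lr <= 0.0:
            raise ValueError("likelihood ratios must be positive")
        log_odds += math.log(lr)
    # Convert the accumulated log-odds back to a probability
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical numbers: a 10% prior and three mildly supportive sources
p = posterior_binding(0.1, [3.0, 2.0, 1.5])
print(round(p, 3))
```

With these toy numbers the prior odds of 1:9 are multiplied by 3 × 2 × 1.5 = 9, giving even odds, i.e. a posterior of 0.5. A ratio below 1 would instead count as evidence against binding.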
Mouse models of cancer represent powerful tools for analysing the role of genetic alterations in carcinogenesis. Using a mouse model that allows tamoxifen-inducible somatic activation (by Cre-mediated recombination) of oncogenic K-rasG12D in a wide range of tissues, we observed hyperplasia of squamous epithelium located in moist or frequently abraded mucosa, with the most dramatic effects in the oral mucosa. This epithelium showed a sequence of squamous hyperplasia followed by squamous papilloma with dysplasia, in which some areas progressed to early invasive squamous cell carcinoma, within 14 days of widespread oncogenic K-ras activation. The marked proliferative response of the oral mucosa to K-rasG12D was most evident in the basal layers of the squamous epithelium of the outer lip with hair follicles and wet mucosal surface, with these cells staining positively for pAKT and cyclin D1, showing Ras/AKT pathway activation and increased proliferation with Ki-67 and EdU positivity. The stromal cells also showed gene activation by recombination and immunopositivity for pERK indicating K-Ras/ERK pathway activation, but without Ki-67 positivity or increase in stromal proliferation. The oral neoplasms showed changes in the expression pattern of cytokeratins (CK6 and CK13), similar to those observed in human oral tumours. Sporadic activation of the K-rasG12D allele (due to background spontaneous recombination in occasional cells) resulted in the development of benign oral squamous papillomas only showing a mild degree of dysplasia with no invasion. In summary, we show that oral mucosa is acutely sensitive to oncogenic K-ras, as widespread expression of activated K-ras in the murine oral mucosal squamous epithelium and underlying stroma can drive the oral squamous papilloma–carcinoma sequence. Copyright © 2011 Pathological Society of Great Britain and Ireland. Published by John Wiley & Sons, Ltd.
oral mucosa; hyperplasia; carcinoma; K-ras; pAKT; pERK; cyclin D1
Macrophages are versatile immune cells that can detect a variety of pathogen-associated molecular patterns through their Toll-like receptors (TLRs). In response to microbial challenge, the TLR-stimulated macrophage undergoes an activation program controlled by a dynamically inducible transcriptional regulatory network. Mapping a complex mammalian transcriptional network poses significant challenges and requires the integration of multiple experimental data types. In this work, we inferred a transcriptional network underlying TLR-stimulated murine macrophage activation. Microarray-based expression profiling and transcription factor binding site motif scanning were used to infer a network of associations between transcription factor genes and clusters of co-expressed target genes. The time-lagged correlation was used to analyze temporal expression data in order to identify potential causal influences in the network. A novel statistical test was developed to assess the significance of the time-lagged correlation. Several associations in the resulting inferred network were validated using targeted ChIP-on-chip experiments. The network incorporates known regulators and gives insight into the transcriptional control of macrophage activation. Our analysis identified a novel regulator (TGIF1) that may have a role in macrophage activation.
Macrophages play a vital role in host defense against infection by recognizing pathogens through pattern recognition receptors, such as the Toll-like receptors (TLRs), and mounting an immune response. Stimulation of TLRs initiates a complex transcriptional program in which induced transcription factor genes dynamically regulate downstream genes. Microarray-based transcriptional profiling has proved useful for mapping such transcriptional programs in simpler model organisms; however, mammalian systems present difficulties such as post-translational regulation of transcription factors, combinatorial gene regulation, and a paucity of available gene-knockout expression data. Additional evidence sources, such as DNA sequence-based identification of transcription factor binding sites, are needed. In this work, we computationally inferred a transcriptional network for TLR-stimulated murine macrophages. Our approach combined sequence scanning with time-course expression data in a probabilistic framework. Expression data were analyzed using the time-lagged correlation. A novel, unbiased method was developed to assess the significance of the time-lagged correlation. The inferred network of associations between transcription factor genes and co-expressed gene clusters was validated with targeted ChIP-on-chip experiments, and yielded insights into the macrophage activation program, including a potential novel regulator. Our general approach could be used to analyze other complex mammalian systems for which time-course expression data are available.
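The time-lagged correlation mentioned above can be illustrated with a small sketch: the expression of a transcription factor at time t is correlated with the expression of a target at time t + lag, and the lag with the strongest correlation is reported. The function names and the toy data are hypothetical, time points are assumed to be evenly spaced, and the novel significance test described in the text is not reproduced here.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (non-constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def lagged_correlation(tf, target, lag):
    """Correlate tf[t] with target[t + lag] for a non-negative lag."""
    if len(tf) != len(target):
        raise ValueError("series must have the same length")
    if lag < 0 or lag > len(tf) - 2:
        raise ValueError("lag must leave at least two paired points")
    paired_tf = tf[: len(tf) - lag]
    paired_target = target[lag:]
    return pearson(paired_tf, paired_target)


def best_lag(tf, target, max_lag):
    """Return (lag, correlation) with the largest absolute correlation."""
    scores = {k: lagged_correlation(tf, target, k) for k in range(max_lag + 1)}
    lag = max(scores, key=lambda k: abs(scores[k]))
    return lag, scores[lag]


# Toy time course: the target repeats the TF profile two time points later
tf_expr = [math.sin(t / 2.0) for t in range(12)]
target_expr = tf_expr[-2:] + tf_expr[:-2]

lag, r = best_lag(tf_expr, target_expr, max_lag=4)
print(lag, round(r, 3))
```

A high correlation at a positive lag is only suggestive of a causal influence, which is why the text pairs it with a significance test and with binding-site evidence.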