UHRF1 is an essential regulator of DNA methylation that is highly expressed in many cancers. Here, we use transgenic zebrafish, cultured cells and human tumors to demonstrate that UHRF1 is an oncogene. UHRF1 overexpression in zebrafish hepatocytes destabilizes and delocalizes DNMT1, causes DNA hypomethylation and Tp53-mediated senescence. Hepatocellular carcinoma (HCC) emerges when senescence is bypassed. tp53 mutation both alleviates senescence and accelerates tumor onset. Human HCCs recapitulate this paradigm, as UHRF1 overexpression defines a subclass of aggressive HCCs characterized by genomic instability, TP53 mutation and abrogation of the TP53-mediated senescence program. We propose that UHRF1 overexpression is a mechanism underlying DNA hypomethylation in cancer cells and that senescence is a primary means of restricting tumorigenesis due to epigenetic disruption.
The p53 tumor suppressor protein is a major sensor of cellular stresses, and upon stabilization, activates or represses many genes that control cell fate decisions. While the mechanism of p53-mediated transactivation is well established, several mechanisms have been proposed for p53-mediated repression. Here, we demonstrate that the CDK inhibitor p21 is both necessary and sufficient for the downregulation of known p53-repression targets, including survivin, CDC25C and CDC25B in response to p53 induction. These same targets are similarly repressed in response to p16 overexpression, implicating the involvement of the shared downstream retinoblastoma (RB)-E2F pathway. We further show that in response to either p53 or p21 induction, E2F4 complexes are specifically recruited onto the promoters of these p53 repression targets. Moreover, abrogation of E2F4 recruitment via the inactivation of RB pocket proteins, but not by RB loss of function alone, prevents the repression of these genes. Finally, our results indicate that E2F4 promoter occupancy is globally associated with p53 repression targets, but not with p53 activation targets, implicating E2F4 complexes as effectors of p21-dependent p53-mediated repression.
p53; p21; E2F4; RB; p130; transcriptional repression
Bacillus alcalophilus AV1934, isolated from human feces, was described in 1934, before microbiome studies and before recent indications of novel potassium ion coupling to motility in this extremophile. Here, we report draft sequences that will facilitate an examination of whether that coupling is part of a larger cycle of potassium ion-coupled transporters.
Argonaute proteins and their small RNA co-factors, short interfering RNAs (siRNAs), are known to inhibit gene expression at the transcriptional and post-transcriptional levels. In Caenorhabditis elegans, the Argonaute CSR-1 binds thousands of endogenous siRNAs (endo-siRNAs) antisense to germline transcripts and associates with chromatin in a siRNA-dependent manner. However, its role in gene expression regulation remains controversial. Here, we used genome-wide profiling of nascent RNA transcripts to demonstrate that the CSR-1 RNAi pathway promotes sense-oriented Pol II transcription. Moreover, a loss of CSR-1 function resulted in a global increase in antisense transcription and ectopic transcription of silent chromatin domains, which led to reduced chromatin incorporation of centromere-specific histone H3. Based on these findings, we propose that the CSR-1 pathway has a role in maintaining the directionality of active transcription, thereby propagating the distinction between transcriptionally active and silent genomic regions.
Eukaryotic Argonautes bind small RNAs and use them as guides to find complementary RNA targets and induce gene silencing. Though homologs of eukaryotic Argonautes are present in many bacteria and archaea, their small RNA partners and functions are unknown. We found that the Argonaute of Rhodobacter sphaeroides (RsAgo) associates with 15-19 nt RNAs that correspond to the majority of transcripts. RsAgo also binds single-stranded 22-24 nt DNA molecules that are complementary to the small RNAs and enriched in sequences derived from exogenous plasmids as well as genome-encoded foreign nucleic acids such as transposons and phage genes. Expression of RsAgo in the heterologous E. coli system leads to formation of plasmid-derived small RNA and DNA and plasmid degradation. In a R. sphaeroides mutant lacking RsAgo, expression of plasmid-encoded genes is elevated. Our results indicate that RNAi-related processes found in eukaryotes are also conserved in bacteria and target foreign nucleic acids.
Fatty liver disease (FLD) is characterized by lipid accumulation in hepatocytes and is accompanied by secretory pathway dysfunction, resulting in induction of the unfolded protein response (UPR). Activating transcription factor 6 (ATF6), one of three main UPR sensors, functions to both promote FLD during acute stress and reduce FLD during chronic stress. There is little mechanistic understanding of how ATF6, or any other UPR factor, regulates hepatic lipid metabolism to cause disease. We addressed this using zebrafish genetics and biochemical analyses and demonstrate that Atf6 is necessary and sufficient for FLD. atf6 transcription is significantly upregulated in the liver of zebrafish with alcoholic FLD and morpholino-mediated atf6 depletion significantly reduced steatosis incidence caused by alcohol. Moreover, overexpression of active, nuclear Atf6 (nAtf6) in hepatocytes caused FLD in the absence of stress. mRNA-Seq and qPCR analyses of livers from five-day-old nAtf6 transgenic larvae revealed upregulation of genes promoting glyceroneogenesis and fatty acid elongation, including fatty acid synthase (fasn), and nAtf6 overexpression in both zebrafish larvae and human hepatoma cells increased the incorporation of 14C-acetate into lipids. Srebp transcription factors are key regulators of lipogenic enzymes, but reducing Srebp activation by scap morpholino injection neither prevented FLD in nAtf6 transgenics nor synergized with atf6 knockdown to reduce alcohol-induced FLD. In contrast, fasn morpholino injection reduced FLD in nAtf6 transgenic larvae and synergistically interacted with atf6 to reduce alcoholic FLD. Thus, our data demonstrate that Atf6 is required for alcoholic FLD and epistatically interacts with fasn to cause this disease, suggesting triglyceride biogenesis as the mechanism of UPR-induced FLD.
Fatty liver disease (steatosis) is the most common liver disease worldwide and is commonly caused by obesity, type 2 diabetes, or alcohol abuse. All of these conditions are associated with impaired hepatocyte protein secretion, resulting in hypoproteinemia that contributes to the systemic complications of these diseases. The unfolded protein response (UPR) is activated in response to stress in the protein secretory pathway and a wealth of data indicates that UPR activation can contribute to steatosis, but the mechanistic basis for this relationship is poorly understood. We identify activating transcription factor 6 (Atf6), one of three UPR sensors, as necessary and sufficient for steatosis and show that Atf6 activation can promote lipogenesis, providing a direct connection between the stress response and lipid metabolism. Blocking Atf6 in zebrafish larvae prevents alcohol-induced steatosis and Atf6 overexpression in zebrafish hepatocytes induces genes that drive lipogenesis, increases lipid production and causes steatosis. Fatty acid synthase (fasn) is a key lipogenic enzyme and we show that fasn is required for fatty liver in response to both ethanol and Atf6 overexpression. Our findings point to Atf6 as a potential therapeutic target for fatty liver disease.
Congenital heart disease (CHD) is the most frequent birth defect, affecting 0.8% of live births (ref. 1). Many cases occur sporadically and impair reproductive fitness, suggesting a role for de novo mutations. By analysis of exome sequencing of parent-offspring trios, we compared the incidence of de novo mutations in 362 severe CHD cases and 264 controls. CHD cases showed a significant excess of protein-altering de novo mutations in genes expressed in the developing heart, with an odds ratio of 7.5 for damaging mutations. Similar odds ratios were seen across major classes of severe CHD. We found a marked excess of de novo mutations in genes involved in production, removal or reading of H3K4 methylation (H3K4me), or ubiquitination of H2BK120, which is required for H3K4 methylation (refs 2–4). There were also two de novo mutations in SMAD2; SMAD2 signaling in the embryonic left-right organizer induces demethylation of H3K27me (ref. 5). H3K4me and H3K27me mark 'poised' promoters and enhancers that regulate expression of key developmental genes (ref. 6). These findings implicate de novo point mutations in several hundred genes that collectively contribute to ~10% of severe CHD.
Lampreys are representatives of an ancient vertebrate lineage that diverged from our own ~500 million years ago. By virtue of this deeply shared ancestry, the sea lamprey (P. marinus) genome is uniquely poised to provide insight into the ancestry of vertebrate genomes and the underlying principles of vertebrate biology. Here, we present the first lamprey whole-genome sequence and assembly. We note challenges faced owing to its high content of repetitive elements and GC bases, as well as the absence of broad-scale sequence information from closely related species. Analyses of the assembly indicate that two whole-genome duplications likely occurred before the divergence of ancestral lamprey and gnathostome lineages. Moreover, the results help define key evolutionary events within vertebrate lineages, including the origin of myelin-associated proteins and the development of appendages. The lamprey genome provides an important resource for reconstructing vertebrate origins and the evolutionary events that have shaped the genomes of extant organisms.
MiST is a novel approach to variant calling from deep sequencing data, using the inverted mapping approach developed for Geoseq. Reads that can map to a targeted exonic region are identified using exact matches to tiles from the region. The reads are then aligned to the targets to discover variants. MiST carefully handles paralogous reads that map ambiguously to the genome and clonal reads arising from PCR bias, which are the two major sources of errors in variant calling. The reduced computational complexity of mapping selected reads to targeted regions of the genome improves speed, specificity and sensitivity of variant detection. Compared with variant calls from the GATK platform, MiST showed better concordance with SNPs from dbSNP and genotypes determined by an exonic-SNP array. Variant calls made only by MiST confirm at a high rate (>90%) by Sanger sequencing. Thus, MiST is a valuable alternative tool to analyse variants in deep sequencing data.
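The read-selection step described above can be illustrated with a minimal Python sketch. It is not the MiST implementation: the function names, the tile length `k`, the forward-strand-only matching, and the treatment of clonal reads as exact duplicates are all assumptions made for illustration.

```python
from typing import List, Set


def make_tiles(target: str, k: int) -> Set[str]:
    """Return every length-k substring (tile) of the targeted region."""
    return {target[i:i + k] for i in range(len(target) - k + 1)}


def select_reads(reads: List[str], target: str, k: int = 8) -> List[str]:
    """Keep reads that share at least one exact k-mer with the target.

    Assumption: only the forward strand is checked; a real pipeline
    would also consider the reverse complement.
    """
    tiles = make_tiles(target, k)
    selected = []
    for read in reads:
        if any(read[i:i + k] in tiles for i in range(len(read) - k + 1)):
            selected.append(read)
    return selected


def collapse_clonal(reads: List[str]) -> List[str]:
    """Collapse identical reads, a simple stand-in for handling PCR clones."""
    return list(dict.fromkeys(reads))  # preserves first-seen order


if __name__ == "__main__":
    # Toy sequences, made up for illustration only.
    target = "ACGTACGTTAGCCGATAGGCTTAACG"
    reads = ["TTAGCCGATAGGCA", "GGGGGGGGGGGGGG", "TTAGCCGATAGGCA"]
    candidates = collapse_clonal(select_reads(reads, target, k=8))
    print(candidates)
```

Only the selected reads would then be aligned to the target to call variants, which is where the reduction in mapping work comes from.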
Influenza A virus (IAV) is an unremitting virus that results in significant morbidity and mortality worldwide. Key to the viral life cycle is the RNA-dependent RNA polymerase (RdRp), a heterotrimeric complex responsible for both transcription and replication of the segmented genome. Here, we demonstrate that the viral polymerase utilizes a small RNA enhancer to regulate enzymatic activity and maintain stoichiometric balance of the viral genome. We demonstrate that IAV synthesizes small viral RNAs (svRNAs) that interact with the viral RdRp in order to promote genome replication in a segment-specific manner. svRNAs localize to the nucleus, the site of IAV replication, are synthesized from the positive-sense genomic intermediate, and interact within a novel RNA binding channel of the polymerase PA subunit. Synthetic svRNAs promote polymerase activity in vitro, while loss of svRNA inhibits viral RNA synthesis in a segment-specific manner. Taking these observations together, we mechanistically define svRNA as a small regulatory enhancer RNA, which functions to promote genome replication and maintain segment balance through allosteric modulation of polymerase activity.
We introduce two large-scale resources for functional analysis of microRNA—a decoy/sponge library for inhibiting microRNA function and a sensor library for monitoring microRNA activity. To take advantage of the sensor library, we developed a high-throughput assay called Sensor-seq, which permits the activity of hundreds of microRNAs to be quantified simultaneously. Using this approach, we show that only the most abundant microRNAs within a cell mediate significant target suppression. Over 60% of detected microRNAs had no discernible activity, indicating that the functional ‘miRNome’ of a cell is considerably smaller than currently inferred from profiling studies. Moreover, some highly expressed microRNAs exhibit relatively weak activity, which in some cases correlated with a high target-to-microRNA ratio or increased nuclear localization of the microRNA. Finally, we show that the microRNA decoy library can be used for pooled loss-of-function studies. These tools provide valuable resources for studying microRNA biology and for microRNA-based therapeutics.
In animal gonads, PIWI proteins and their bound 23–30 nt piRNAs guard genome integrity by the sequence-specific silencing of transposons. Two branches of piRNA biogenesis, namely primary processing and ping-pong amplification, have been proposed. Despite an overall conceptual understanding of piRNA biogenesis, the identity and/or function of the players involved are largely unknown. Here, we demonstrate an essential role for the female sterility gene shutdown in piRNA biology. Shutdown, an evolutionarily conserved cochaperone, collaborates with Hsp90 during piRNA biogenesis, potentially at the loading step of RNAs into PIWI proteins. We demonstrate that Shutdown is essential for both primary and secondary piRNA populations in Drosophila. An extension of our study to previously described piRNA pathway members revealed three distinct groups of biogenesis factors. Together with data on how PIWI proteins are wired into primary and secondary processing, we propose a unified model for piRNA biogenesis.
► The cochaperone Shutdown is an essential piRNA biogenesis factor
► Primary and secondary piRNA biogenesis feed into a common biogenesis step
► Piwi and Aub, but not AGO3, are loaded with primary piRNAs
Considerable detail about microRNA (miRNA) biogenesis and regulation has been uncovered; however, little is known about the fate of the miRNA subsequent to target regulation. To gain insight into this process, we carried out kinetic analysis of a miRNA’s turnover following termination of its biogenesis, and during regulation of a target that is not subject to Ago2-mediated catalytic cleavage. By quantitating the number of molecules of the miRNA and its target in steady-state, and in the course of its decay, we found that each miRNA molecule was able to regulate at least 2 target transcripts, providing in vivo evidence that the miRNA is not irreversibly sequestered with its target, and that the non-slicing pathway of miRNA regulation is multiple-turnover. Using deep-sequencing, we further show that miRNA recycling is limited by target regulation, which promotes post-transcriptional modifications to the 3′ end of the miRNA, and accelerates the miRNA’s rate of decay. These studies provide new insight into the efficiency of miRNA regulation, which helps to explain how a miRNA can regulate a vast number of transcripts, and identify one of the mechanisms that impart specificity to miRNA decay in mammalian cells.
Protecting the genome from transposable element (TE) mobilization is critical for germline development. In Drosophila, Piwi proteins and their bound small RNAs (piRNAs) provide a potent defense against TE activity. TE-targeting piRNAs are processed from TE-dense heterochromatic loci termed ‘piRNA clusters’. While piRNA biogenesis from cluster precursors is beginning to be understood, little is known about piRNA cluster transcriptional regulation. Here we show that deposition of histone 3 lysine 9 methylation by the methyltransferase dSETDB1 (egg) is required for piRNA cluster transcription. In the absence of dSETDB1, cluster precursor transcription collapses in germline and somatic gonadal cells and TEs are activated, resulting in germline loss and a block in germline stem cell differentiation. We propose that heterochromatin protects the germline by activating the piRNA pathway.
RNA-Seq allows a theoretically unbiased analysis of both genome-wide transcription levels and mutation status of a tumor. Using this technique we sought to identify novel candidate therapeutic targets expressed in epithelial ovarian cancer (EOC).
Specifically, we sought candidate invasion/migration targets based on expression levels across all tumors, novelty of expression in EOC, and known function. RNA-Seq analysis revealed the high expression of CD151, a transmembrane protein, across all stages of EOC. Expression was confirmed at both the mRNA and protein levels using RT-PCR and immunohistochemical staining, respectively.
In both EOC tumors and normal ovarian surface epithelial cells we demonstrated CD151 to be localized to the membrane and cell-cell junctions in patient-derived and established EOC cell lines. We next evaluated its role in EOC dissemination using two ovarian cancer-derived cell lines with differential levels of CD151 expression. Targeted antibody-mediated and siRNA inhibition or loss of CD151 in SKOV3 and OVCAR5 cell lines effectively inhibited their migration and invasion.
Taken together, these findings provide the first proof-of-principle demonstration for a next generation sequencing approach to identifying candidate therapeutic targets and reveal CD151 to play a role in EOC dissemination.
CD151; Epithelial Ovarian Cancer; Invasion; Migration; Metastasis; RNA-Seq
We sought to identify candidate serum biomarkers for the detection and surveillance of EOC. Based on RNA-Seq transcriptome analysis of patient-derived tumors, highly expressed secreted proteins were identified using a bioinformatic approach.
RNA-Seq was used to quantify papillary serous ovarian cancer transcriptomes. Paired-end sequencing of 22 flash-frozen tumors was performed. Sequence alignments were processed with the program ELAND and expression levels with ERANGE, and the results were then bioinformatically screened for secreted protein signatures. Serum samples from women with benign and malignant pelvic masses and serial samples from women during chemotherapy regimens were measured for IGFBP-4 by ELISA. Student's t-test, ANOVA, and ROC curves were used for statistical analysis.
Insulin-like growth factor binding protein 4 (IGFBP-4) was consistently present in the top 7.5% of all expressed genes in all tumor samples. We then screened serum samples to determine if increased tumor expression correlated with serum expression. In an initial discovery set of 21 samples, IGFBP-4 levels were found to be elevated in patients, including those with early stage disease and normal CA125 levels. In a larger and independent validation set (82 controls, 78 cases), IGFBP-4 levels were significantly increased (p < 5 × 10⁻⁵). IGFBP-4 levels were ~3× greater in women with malignant pelvic masses compared to women with benign masses. ROC sensitivity was 73% at 93% specificity (AUC 0.816). In women receiving chemotherapy, average IGFBP-4 levels were below the ROC-determined threshold and lower in NED patients compared to AWD patients.
This study, the first to our knowledge to use RNA-Seq for biomarker discovery, identified IGFBP-4 as overexpressed in ovarian cancer patients. Beyond this, these studies identified two additional intriguing findings. First, IGFBP-4 can be elevated in early stage disease without elevated CA125. Second, IGFBP-4 levels are significantly elevated with malignant versus benign disease. These findings provide the rationale for future validation studies.
IGFBP-4; epithelial ovarian cancer; serum biomarker; RNA-Seq; transcriptome
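For readers less familiar with the ROC quantities reported for IGFBP-4 (sensitivity at a given specificity, and AUC), the following Python sketch shows how they can be computed from two groups of serum measurements. It is a generic illustration, not the study's analysis code; the function names are hypothetical and the numbers in the usage example are toy values, not study data. The AUC is computed with the pairwise (Mann-Whitney) formulation, which is an assumption about method rather than something stated in the abstract.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    cases: Sequence[float], controls: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Call a sample positive when its level is >= threshold."""
    true_pos = sum(v >= threshold for v in cases)
    true_neg = sum(v < threshold for v in controls)
    return true_pos / len(cases), true_neg / len(controls)


def auc(cases: Sequence[float], controls: Sequence[float]) -> float:
    """Area under the ROC curve as the probability that a randomly
    chosen case scores higher than a randomly chosen control
    (ties count as one half)."""
    wins = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(controls))


if __name__ == "__main__":
    # Toy values for illustration only; NOT measurements from the study.
    toy_cases = [5.0, 6.0, 7.0, 3.0]
    toy_controls = [1.0, 2.0, 3.0, 4.0]
    print(sensitivity_specificity(toy_cases, toy_controls, threshold=4.5))
    print(auc(toy_cases, toy_controls))
```

Sweeping `threshold` over the observed values traces out the ROC curve; a reported operating point such as "73% sensitivity at 93% specificity" corresponds to one threshold on that curve.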
In Drosophila, Piwi proteins associate with Piwi-interacting RNAs (piRNAs) and protect the germline genome by silencing mobile genetic elements. This defense system acts in germline and gonadal somatic tissue to preserve germline development. Genetic control for these silencing pathways varies greatly between tissues of the gonad. Here, we identified Vreteno (Vret), a novel gonad-specific protein essential for germline development. Vret is required for piRNA-based transposon regulation in both germline and somatic gonadal tissues. We show that Vret, which contains Tudor domains, associates physically with Piwi and Aubergine (Aub), stabilizing these proteins via a gonad-specific mechanism that is absent in other fly tissues. In the absence of vret, Piwi-bound piRNAs are lost without changes in piRNA precursor transcript production, supporting a role for Vret in primary piRNA biogenesis. In the germline, piRNAs can engage in an Aub- and Argonaute 3 (AGO3)-dependent amplification in the absence of Vret, suggesting that Vret function can distinguish between primary piRNAs loaded into Piwi-Aub complexes and piRNAs engaged in the amplification cycle. We propose that Vret plays an essential role in transposon regulation at an early stage of primary piRNA processing.
Germline stem cell; Soma; Transposon; Piwi; Aubergine; piRNAs; Tudor; Drosophila
Deep sequencing of small RNAs (sRNA-seq) is now the gold standard for small RNA profiling and discovery. Biases in sRNA-seq have been reported, but their etiology remains unidentified. Through a comprehensive series of sRNA-seq experiments, we establish that the predominant cause of the bias is the RNA ligases. We further demonstrate that RNA ligases have strong sequence-specific biases which distort the small RNA profiles considerably. We have devised a pooled adapter strategy to overcome this bias, and validated the method through data derived from microarray and qPCR. In light of our findings, published small RNA profiles, as well as barcoding strategies using adapter-end modifications, may need to be revisited. Importantly, by providing a wide spectrum of substrate for the ligase, the pooled-adapter strategy developed here provides a means to overcome issues of bias, and generate more accurate small RNA profiles.
Pseudogenes populate the mammalian genome as remnants of artefactual incorporation of coding messenger RNAs into transposon pathways (ref. 1). Here we show that a subset of pseudogenes generates endogenous small interfering RNAs (endo-siRNAs) in mouse oocytes. These endo-siRNAs are often processed from double-stranded RNAs formed by hybridization of spliced transcripts from protein-coding genes to antisense transcripts from homologous pseudogenes. An inverted repeat pseudogene can also generate abundant small RNAs directly. A second class of endo-siRNAs may enforce repression of mobile genetic elements, acting together with Piwi-interacting RNAs. Loss of Dicer, a protein integral to small RNA production, increases expression of endo-siRNA targets, demonstrating their regulatory activity. Our findings indicate a function for pseudogenes in regulating gene expression by means of the RNA interference pathway and may, in part, explain the evolutionary pressure to conserve argonaute-mediated catalysis in mammals.
Datasets generated on deep-sequencing platforms have been deposited in various public repositories such as the Gene Expression Omnibus (GEO), Sequence Read Archive (SRA) hosted by the NCBI, or the DNA Data Bank of Japan (DDBJ). Despite being rich data sources, they have not been used much due to the difficulty in locating and analyzing datasets of interest.
Geoseq (http://geoseq.mssm.edu) provides a new method of analyzing short reads from deep sequencing experiments. Instead of mapping the reads to reference genomes or sequences, Geoseq maps a reference sequence against the sequencing data. It is web-based, and holds pre-computed data from public libraries. The analysis reduces the input sequence to tiles and measures the coverage of each tile in a sequence library through the use of suffix arrays. The user can upload custom target sequences or use gene/miRNA names for the search and get back results as plots and spreadsheet files. Geoseq organizes the public sequencing data using a controlled vocabulary, allowing identification of relevant libraries by organism, tissue and type of experiment.
Analysis of small sets of sequences against deep-sequencing datasets, as well as identification of public datasets of interest, is simplified by Geoseq. We applied Geoseq to (a) identify differential isoform expression in mRNA-seq datasets, (b) identify miRNAs (microRNAs) in libraries and identify mature and star sequences in miRNAs, and (c) identify potentially mis-annotated miRNAs. The ease of using Geoseq for these analyses suggests its utility and uniqueness as an analysis tool.
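The tile-and-suffix-array idea behind Geoseq can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Geoseq's code: the function names, the `$` separator, the tile length and step, and the use of raw occurrence counts as the "coverage" of a tile are all hypothetical choices. The suffix array is built naively by sorting suffixes, which is adequate only for toy inputs.

```python
from typing import List, Tuple


def build_suffix_array(text: str) -> List[int]:
    """Naive suffix array: start positions sorted by suffix (toy scale only)."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def count_occurrences(text: str, sa: List[int], pattern: str) -> int:
    """Count occurrences of pattern using two binary searches on the
    suffix array (lower and upper bound of the matching block)."""
    m = len(pattern)
    lo, hi = 0, len(sa)
    while lo < hi:  # first suffix whose m-prefix is >= pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo
    hi = len(sa)
    while lo < hi:  # first suffix whose m-prefix is > pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return lo - start


def tile_coverage(
    reference: str, reads: List[str], k: int, step: int
) -> List[Tuple[int, int]]:
    """Return (tile start, occurrence count) for each tile of the reference.

    Reads are joined with '$' so that no tile can match across a read
    boundary (assumes '$' never occurs in the sequences).
    """
    text = "$".join(reads)
    sa = build_suffix_array(text)
    return [
        (i, count_occurrences(text, sa, reference[i:i + k]))
        for i in range(0, len(reference) - k + 1, step)
    ]


if __name__ == "__main__":
    # Toy library and reference, made up for illustration.
    library = ["ACGTAC", "GTACGT", "TTTTTT"]
    print(tile_coverage("ACGTTTTT", library, k=4, step=4))
```

Because the suffix array depends only on the library, it can be computed once per dataset and reused for any query sequence, which fits the abstract's description of holding pre-computed data from public libraries.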
Drosophila endogenous small RNAs are categorized according to their mechanisms of biogenesis and the Argonaute protein to which they bind. MicroRNAs are a class of ubiquitously expressed RNAs of ~22 nucleotides in length, which arise from structured precursors through the action of Drosha–Pasha and Dicer-1–Loquacious complexes (refs 1–7). These join Argonaute-1 to regulate gene expression (refs 8, 9). A second endogenous small RNA class, the Piwi-interacting RNAs, bind Piwi proteins and suppress transposons (refs 10, 11). Piwi-interacting RNAs are restricted to the gonad, and at least a subset of these arises by Piwi-catalysed cleavage of single-stranded RNAs (refs 12, 13). Here we show that Drosophila generates a third small RNA class, endogenous small interfering RNAs, in both gonadal and somatic tissues. Production of these RNAs requires Dicer-2, but a subset depends preferentially on Loquacious (refs 1, 4, 5) rather than the canonical Dicer-2 partner, R2D2 (ref. 14). Endogenous small interfering RNAs arise both from convergent transcription units and from structured genomic loci in a tissue-specific fashion. They predominantly join Argonaute-2 and have the capacity, as a class, to target both protein-coding genes and mobile elements. These observations expand the repertoire of small RNAs in Drosophila, adding a class that blurs distinctions based on known biogenesis mechanisms and functional roles.
In Drosophila gonads, Piwi proteins and associated piRNAs collaborate with additional factors to form a small RNA-based immune system that silences mobile elements. Here, we analyzed nine Drosophila piRNA pathway mutants for their impacts on both small RNA populations and the subcellular localization patterns of Piwi proteins. We find that distinct piRNA pathways with differing components function in ovarian germ and somatic cells. In the soma, Piwi acts singularly with the conserved flamenco piRNA cluster to enforce silencing of retroviral elements that may propagate by infecting neighboring germ cells. In the germline, silencing programs encoded within piRNA clusters are optimized via a slicer-dependent amplification loop to suppress a broad spectrum of elements. The classes of transposons targeted by germline and somatic piRNA clusters, though not the precise elements, are conserved among Drosophilids, demonstrating that the architecture of piRNA clusters has coevolved with the transposons that they are tasked to control.
In plants and mammals, small RNAs indirectly mediate epigenetic inheritance by specifying cytosine methylation. We found that small RNAs themselves serve as vectors for epigenetic information. Crosses between Drosophila strains that differ in the presence of a particular transposon can produce sterile progeny, a phenomenon called hybrid dysgenesis. This phenotype manifests itself only if the transposon is paternally inherited, suggesting maternal transmission of a factor that maintains fertility. In both P- and I-element–mediated hybrid dysgenesis models, daughters show a markedly different content of Piwi-interacting RNAs (piRNAs) targeting each element, depending on their parents of origin. Such differences persist from fertilization through adulthood. This indicates that maternally deposited piRNAs are important for mounting an effective silencing response and that a lack of maternal piRNA inheritance underlies hybrid dysgenesis.
We present a draft genome sequence of the platypus, Ornithorhynchus anatinus. This monotreme exhibits a fascinating combination of reptilian and mammalian characters. For example, platypuses have a coat of fur adapted to an aquatic lifestyle; platypus females lactate, yet lay eggs; and males are equipped with venom similar to that of reptiles. Analysis of the first monotreme genome aligned these features with genetic innovations. We find that reptile and platypus venom proteins have been co-opted independently from the same gene families; milk protein genes are conserved despite platypuses laying eggs; and immune gene family expansions are directly related to platypus biology. Expansions of protein, non-protein-coding RNA and microRNA families, as well as repeat elements, are identified. Sequencing of this genome now provides a valuable resource for deep mammalian comparative analyses, as well as for monotreme biology and conservation.