Anopheles darlingi is the principal neotropical malaria vector, responsible for more than a million cases of malaria per year on the American continent. Anopheles darlingi diverged from the African and Asian malaria vectors ∼100 million years ago (mya) and successfully adapted to the New World environment. Here we present an annotated reference A. darlingi genome, sequenced from a wild population of males and females collected in the Brazilian Amazon. A total of 10 481 predicted protein-coding genes were annotated, 72% of which have their closest counterpart in Anopheles gambiae and 21% have highest similarity with other mosquito species. In spite of a long period of divergent evolution, conserved gene synteny was observed between A. darlingi and A. gambiae. More than 10 million single nucleotide polymorphisms and short indels with potential use as genetic markers were identified. Transposable elements correspond to 2.3% of the A. darlingi genome. Genes associated with hematophagy, immunity and insecticide resistance, directly involved in vector–human and vector–parasite interactions, were identified and discussed. This study represents the first effort to sequence the genome of a neotropical malaria vector, and opens a new window through which we can contemplate the evolutionary history of anopheline mosquitoes. It also provides valuable information that may lead to novel strategies to reduce malaria transmission on the South American continent. The A. darlingi genome is accessible at www.labinfo.lncc.br/index.php/anopheles-darlingi.
CRISPR-Cas systems are RNA-guided immune systems that protect prokaryotes against viruses and other invaders. The CRISPR locus encodes crRNAs that recognize invading nucleic acid sequences and trigger silencing by the associated Cas proteins. There are multiple CRISPR-Cas systems with distinct compositions and mechanistic processes. Thermococcus kodakarensis (Tko) is a hyperthermophilic euryarchaeon that has both a Type I-A Csa and a Type I-B Cst CRISPR-Cas system. We have analyzed the expression and composition of crRNAs from the three CRISPRs in Tko by RNA deep sequencing and northern analysis. Our results indicate that crRNAs associated with these two CRISPR-Cas systems include an 8-nucleotide conserved sequence tag at the 5′ end. We challenged Tko with plasmid invaders containing sequences targeted by endogenous crRNAs and observed active CRISPR-Cas-mediated silencing. Plasmid silencing was dependent on complementarity with a crRNA as well as on a sequence element found immediately adjacent to the crRNA recognition site in the target termed the PAM (protospacer adjacent motif). Silencing occurred independently of the orientation of the target sequence in the plasmid, and appears to occur at the DNA level, presumably via DNA degradation. In addition, we have directed silencing of an invader plasmid by genetically engineering the chromosomal CRISPR locus to express customized crRNAs directed against the plasmid. Our results support CRISPR engineering as a feasible approach to develop prokaryotic strains that are resistant to infection for use in industry.
CRISPR; Cas; archaea; Thermococcus; hyperthermophile; immune; RNA; DNA; silencing; interference
Small RNAs target invaders for silencing in the CRISPR-Cas pathways that protect bacteria and archaea from viruses and plasmids. The CRISPR RNAs (crRNAs) contain sequence elements acquired from invaders that guide CRISPR-associated (Cas) proteins back to the complementary invading DNA or RNA. Here, we have analyzed essential features of the crRNAs associated with the Cas RAMP module (Cmr) effector complex, which cleaves targeted RNAs. We show that Cmr crRNAs contain an 8-nucleotide 5′ sequence tag (also found on crRNAs associated with other CRISPR-Cas pathways) that is critical for crRNA function and can be used to engineer crRNAs that direct cleavage of novel targets. We also present data indicating that the Cmr complex cleaves an endogenous complementary RNA in Pyrococcus furiosus, providing direct in vivo evidence of RNA targeting by the CRISPR-Cas system. Our findings indicate that the CRISPR RNA-Cmr protein pathway may be exploited to cleave RNAs of interest.
Genomic imprinting occurs when expression of an allele differs based on the sex of the parent that transmitted the allele. In D. melanogaster, imprinting can occur, but its impact on allelic expression genome-wide is unclear. Here, we search for imprinted genes in D. melanogaster using RNA-seq to compare allele-specific expression between pools of 7–10 day old adult female progeny from reciprocal crosses. 119 genes with allelic expression patterns consistent with imprinting were identified and showed significant clustering within the genome. Surprisingly, additional analysis of several of these genes showed that either genomic heterogeneity or high levels of intrinsic noise caused imprinting-like allelic expression. Consequently, our data provide no convincing evidence of imprinting for D. melanogaster genes in their native genomic context. Elucidating sources of false positive signals for imprinting in allele-specific RNA-seq data, as done here, is critical given the growing popularity of this method for identifying imprinted genes.
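The logic of a reciprocal-cross comparison can be illustrated with a minimal sketch. This is not the pipeline used in the study: the function names, the read counts, and the fixed 0.2 threshold are all hypothetical, and a real analysis would use replicate-aware statistical tests rather than a cutoff. The idea is that a parent-of-origin effect favors the maternal (or paternal) allele in both crosses, whereas a strain-specific (cis-regulatory) effect favors the same strain's allele regardless of which parent transmitted it.

```python
def maternal_fraction(maternal_reads, paternal_reads):
    """Fraction of allele-specific reads that come from the maternal allele."""
    total = maternal_reads + paternal_reads
    if total == 0:
        raise ValueError("no allele-specific reads for this gene")
    return maternal_reads / total


def classify(cross1, cross2, threshold=0.2):
    """Classify one gene from reciprocal crosses between strains A and B.

    cross1: (reads_A, reads_B) from progeny of A mothers x B fathers
    cross2: (reads_A, reads_B) from progeny of B mothers x A fathers
    threshold: hypothetical minimum deviation from 0.5 to call a bias
    """
    d1 = maternal_fraction(cross1[0], cross1[1]) - 0.5  # A is maternal
    d2 = maternal_fraction(cross2[1], cross2[0]) - 0.5  # B is maternal
    if abs(d1) < threshold or abs(d2) < threshold:
        return "no consistent bias"
    if d1 * d2 > 0:
        # Same parental allele favored in both crosses.
        return "parent-of-origin bias (imprinting-like)"
    # Same strain's allele favored in both crosses.
    return "strain-specific bias"


# Hypothetical read counts for three genes.
print(classify((90, 10), (10, 90)))  # maternal allele favored in both crosses
print(classify((90, 10), (90, 10)))  # strain A allele favored in both crosses
print(classify((50, 50), (52, 48)))  # balanced expression
```

An imprinting-like call from such a comparison is only a candidate; as the abstract notes, genomic heterogeneity and intrinsic noise can produce the same pattern.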
The collection of components required to carry out the intricate processes involved in generating and maintaining a living, breathing and, sometimes, thinking organism is staggeringly complex. Where do all of the parts come from? Early estimates stated that about 100,000 genes would be required to make up a mammal; however, the actual number is less than one-quarter of that, barely four times the number of genes in budding yeast. It is now clear that the ‘missing’ information is in large part provided by alternative splicing, the process by which multiple different functional messenger RNAs, and therefore proteins, can be synthesized from a single gene.
Alternative splicing is a widespread means of increasing protein diversity and regulating gene expression in eukaryotes. Much progress has been made in understanding the proteins involved in regulating alternative splicing, the sequences they bind to, and how these interactions lead to changes in splicing patterns. However, several recent studies have identified other players involved in regulating alternative splicing. A major theme emerging from these studies is that RNA secondary structures play an underappreciated role in the regulation of alternative splicing. This review provides an overview of the basic aspects of splicing regulation and highlights recent progress in understanding the role of RNA secondary structure in this process.
We analyzed the usage and consequences of alternative cleavage and polyadenylation (APA) in Drosophila melanogaster by using >1 billion reads of stranded mRNA-seq across a variety of dissected tissues. Beyond demonstrating that a majority of fly transcripts are subject to APA, we observed broad trends for 3′ untranslated region (UTR) shortening in the testis and lengthening in the central nervous system (CNS); the latter included hundreds of unannotated extensions ranging up to 18 kb. Extensive northern analyses validated the accumulation of full-length neural extended transcripts, and in situ hybridization indicated their spatial restriction to the CNS. Genes encoding RNA binding proteins (RBPs) and transcription factors were preferentially subject to 3′ UTR extensions. Motif analysis indicated enrichment of miRNA and RBP sites in the neural extensions, and their termini were enriched in canonical cis elements that promote cleavage and polyadenylation. Altogether, we reveal broad tissue-specific patterns of APA in Drosophila and transcripts with unprecedented 3′ UTR length in the nervous system.
Drosophila melanogaster is one of the best-studied genetic model organisms; nonetheless, its genome still contains unannotated coding and non-coding genes, transcripts, exons, and RNA editing sites. Full discovery and annotation are prerequisites for understanding how the regulation of transcription, splicing, and RNA editing directs development of this complex organism. We used RNA-Seq, tiling microarrays, and cDNA sequencing to explore the transcriptome in 30 distinct developmental stages. We identified 111,195 new elements, including thousands of genes, coding and non-coding transcripts, exons, splicing and editing events and inferred protein isoforms that previously eluded discovery using established experimental, prediction and conservation-based approaches. Together, these data substantially expand the number of known transcribed elements in the Drosophila genome and provide a high-resolution view of transcriptome dynamics throughout development.
Lynch syndrome (LS) leads to an increased risk of early-onset colorectal and other types of cancer and is caused by germline mutations in DNA mismatch repair (MMR) genes. Loss of MMR function results in a mutator phenotype that likely underlies its role in tumorigenesis. However, loss of MMR also results in the elimination of a DNA damage-induced checkpoint/apoptosis activation barrier that may allow damaged cells to grow unchecked. A fundamental question is whether loss of MMR provides pre-cancerous stem cells an immediate selective advantage in addition to establishing a mutator phenotype. To test this hypothesis in an in vivo system, we utilized the planarian Schmidtea mediterranea, which contains a significant population of identifiable adult stem cells. We identified a planarian homolog of human MSH2, an MMR gene that is mutated in 38% of LS cases. The planarian Smed-msh2 is expressed in stem cells and some progeny. We depleted Smed-msh2 mRNA levels by RNA interference and found a striking survival advantage in these animals treated with a cytotoxic DNA alkylating agent compared to control animals. We demonstrated that this tolerance to DNA damage is due to the survival of mitotically active, MMR-deficient stem cells. Our results suggest that loss of MMR provides an in vivo survival advantage to the stem cell population in the presence of DNA damage that may have implications for tumorigenesis.
The Down syndrome cell adhesion molecule (Dscam) gene has essential roles in neural wiring and pathogen recognition in Drosophila melanogaster. Dscam encodes 38,016 distinct isoforms via extensive alternative splicing. The 95 alternative exons in Dscam are organized into clusters that are spliced in a mutually exclusive manner. The exon 6 cluster contains 48 variable exons and uses a complex system of competing RNA structures to ensure that only one variable exon is included. Here we show that the heterogeneous nuclear ribonucleoprotein hrp36 acts specifically within, and throughout, the exon 6 cluster to prevent the inclusion of multiple exons. Moreover, hrp36 prevents serine/arginine-rich proteins from promoting the ectopic inclusion of multiple exon 6 variants. Thus, the fidelity of mutually exclusive splicing in the exon 6 cluster is governed by an intricate combination of alternative RNA structures and a globally acting splicing repressor.
RNAs can be physically classified into poly(A)+ or poly(A)- transcripts according to the presence or absence of a poly(A) tail at their 3' ends. Current deep sequencing approaches largely depend on the enrichment of transcripts with a poly(A) tail, and therefore offer little insight into the nature and expression of transcripts that lack poly(A) tails.
We have used deep sequencing to explore the repertoire of both poly(A)+ and poly(A)- RNAs from HeLa cells and H9 human embryonic stem cells (hESCs). Using stringent criteria, we found that while the majority of transcripts are poly(A)+, a significant portion of transcripts are either poly(A)- or bimorphic, being found in both the poly(A)+ and poly(A)- populations. Further analyses revealed that many mRNAs may not contain classical long poly(A) tails and such messages are overrepresented in specific functional categories. In addition, we surprisingly found that a few excised introns accumulate in cells and thus constitute a new class of non-polyadenylated long non-coding RNAs. Finally, we have identified a specific subset of poly(A)- histone mRNAs, including two histone H1 variants, that are expressed in undifferentiated hESCs and are rapidly diminished upon differentiation; further, these same histone genes are induced upon reprogramming of fibroblasts to induced pluripotent stem cells.
We offer a rich source of data that allows a deeper exploration of the poly(A)- landscape of the eukaryotic transcriptome. The approach we present here also applies to the analysis of the poly(A)- transcriptomes of other organisms.
Alternative splicing is typically thought to be controlled by RNA binding proteins that modulate the activity of the spliceosome. A new study not only demonstrates that alternative splicing can be regulated without the involvement of auxiliary splicing factors, but also provides mechanistic insight into how this can occur.
In this issue of Molecular Cell, Schwer (2008) demonstrates that during the latest stage of the splicing reaction the RNA-dependent helicase Prp22 is deposited upon the downstream exon where it subsequently strips the spliced messenger RNA from the spliceosome.
A new study reveals that extracellular signals can activate a signal-transduction cascade that simultaneously alters alternative splicing and translation of the same target. These concerted efforts probably serve to increase the speed and strength of the cellular response to changes in the extracellular environment.
The Drosophila fruitless (fru) gene encodes a transcription factor that regulates essentially all aspects of male courtship behavior. The use of alternative 5′-splice sites generates fru isoforms that determine gender-appropriate sexual behaviors. Alternative splicing of fru is regulated by TRA and TRA2 and depends on an exonic splicing enhancer (fruRE) consisting of three 13-nucleotide repeat elements, nearly identical to those that regulate alternative sex-specific 3′-splice site choice in the doublesex (dsx) gene. dsx has provided a useful model system to investigate the mechanisms of enhancer-dependent 3′-splice site choice. However, little is known about enhancer-dependent regulation of alternative 5′-splice sites. The mechanisms of this process were investigated using an in vitro system in which recombinant TRA/TRA2 could activate the female-specific 5′-splice site of fru. Mutational analysis demonstrated that one 13-nucleotide repeat element within the fruRE is necessary and sufficient to activate the regulated female-specific splice site. As was established for dsx, the fruRE can be replaced by a short element encompassing tandem 13-nucleotide repeat elements, by heterologous splicing enhancers, and by artificially tethering a splicing activator to the pre-mRNA. Complementation experiments showed that Ser/Arg-rich proteins facilitate enhancer-dependent 5′-splice site activation. We conclude that splicing enhancers function similarly in activating regulated 5′- and 3′-splice sites. These results suggest that exonic splicing enhancers recruit multiple spliceosomal components required for the initial recognition of 5′- and 3′-splice sites.
A new study in this issue of Molecular Cell (Pleiss et al., 2007b) shows that changes in the environment rapidly alter the splicing efficiency of specific pre-mRNAs in yeast.
RNA interference (RNAi) is a useful tool for degrading targeted messenger RNAs (mRNAs) and thus “knocking down” the abundance of the encoded protein. We have been using RNAi in cultured Drosophila cells to evaluate the effect of “knocking down” numerous mRNA processing factors on the alternative splicing of specific pre-mRNAs. This relatively simple technique has allowed us to identify a number of splicing factors that impact the alternative splicing of particular alternatively spliced exons. This approach can be extended to examine the splicing of nearly any gene.
RNA interference (RNAi); Drosophila melanogaster; Schneider (S2) cells; knock-down
Single-strand conformational polymorphism analysis has been used successfully to identify single nucleotide changes within sequences based on the fact that multidetection enhancement gels will separate molecules based on their conformation rather than their size. We have expanded the utility of this technique to readily analyze the alternative splicing of pre-mRNAs containing multiple mutually exclusive exons of the same size. We have used this technique to study the Caenorhabditis elegans let-2 gene, which contains two alternative exons, and the Drosophila melanogaster Dscam gene, which contains 12 mutually exclusive exons. The ease and quantitative nature of this technique should make it very useful.
Alternative splicing; single-strand conformational polymorphism (SSCP); exons
Numerous inherited human genetic disorders are caused by defects in pre-mRNA splicing. Two recent studies have added a new twist to the link between genetic variation and pre-mRNA splicing by identifying SNPs that correlate with heritable changes in alternative splicing but do not cause disease. This suggests that allele-specific alternative splicing is a mechanism that accounts for individual variation in the human population.
RNA interference (RNAi) is becoming a popular method for analyzing gene function in a variety of biological processes. We have used RNAi in cultured Drosophila cells to identify trans-acting factors that regulate the alternative splicing of endogenously transcribed pre-mRNAs. We have generated a dsRNA library comprising ~70% of the Drosophila genes encoding RNA binding proteins and assessed the function of each protein in the regulation of alternative splicing. This approach not only identifies trans-acting factors regulating specific alternative splicing events, but also can provide insight into the alternative splicing regulatory networks of Drosophila. Here, we describe this RNAi approach to identify alternative splicing regulatory proteins in detail.
Alternative splicing; RNA interference; Drosophila
Drosophila Dscam encodes 38,016 distinct axon guidance receptors through the mutually exclusive alternative splicing of 95 variable exons. Importantly, known mechanisms that ensure the mutually exclusive splicing of pairs of exons cannot explain this phenomenon in Dscam. I have identified two classes of conserved elements in the Dscam exon 6 cluster, which contains 48 alternative exons—the docking site, located in the intron downstream of constitutive exon 5, and the selector sequences, which are located upstream of each exon 6 variant. Strikingly, each selector sequence is complementary to a portion of the docking site, and this pairing juxtaposes one, and only one, alternative exon to the upstream constitutive exon. The mutually exclusive nature of the docking site:selector sequence interactions suggests that the formation of these competing RNA structures is a central component of the mechanism guaranteeing that only one exon 6 variant is included in each Dscam mRNA.
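The docking site:selector sequence idea can be sketched in a few lines. This is a toy illustration under stated assumptions, not the analysis performed in the study: the sequences and exon labels below are invented, only perfect Watson-Crick pairing is considered, and the function names are hypothetical. Each selector is tested for complementarity to a portion of the docking site, and overlapping pairing regions are treated as mutually exclusive because the same docking-site nucleotides cannot pair with two selectors at once.

```python
# Watson-Crick complement table for RNA (G-U wobble pairs are ignored here).
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence."""
    return rna.translate(COMPLEMENT)[::-1]


def pairing_region(selector, docking_site):
    """Return (start, end) of the docking-site segment the selector can pair
    with, or None if the selector is not complementary to any portion."""
    start = docking_site.find(reverse_complement(selector))
    if start == -1:
        return None
    return (start, start + len(selector))


def mutually_exclusive(region_a, region_b):
    """Two pairings compete if their docking-site segments overlap."""
    return region_a[0] < region_b[1] and region_b[0] < region_a[1]


# Invented sequences for illustration only.
docking_site = "GGAUCCGUACGUUAGC"
selectors = {
    "exon 6.a": "GUACGGAU",  # complementary to docking_site[2:10]
    "exon 6.b": "UAACGUAC",  # complementary to docking_site[6:14]
    "control": "AAAAAAAA",   # no complementarity
}

regions = {name: pairing_region(seq, docking_site) for name, seq in selectors.items()}
print(regions)
print(mutually_exclusive(regions["exon 6.a"], regions["exon 6.b"]))
```

In this toy case the two selectors pair with overlapping segments of the docking site, so forming one structure precludes the other.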
A report on the 2nd Symposium on Alternative Transcript Diversity, Heidelberg, Germany, 21-23 March 2006.
The Drosophila Dscam gene encodes 38,016 different proteins, due to alternative splicing of 95 of its 115 exons, that function in axon guidance and innate immunity. The alternative exons are organized into four clusters, and the exons within each cluster are spliced in a mutually exclusive manner. Here we describe an evolutionarily conserved RNA secondary structure we call the Inclusion Stem (iStem) that is required for efficient inclusion of all 12 variable exons in the exon 4 cluster. Although the iStem governs inclusion or exclusion of the entire exon 4 cluster, it does not play a significant role in determining which variable exon is selected. Thus, the iStem is a novel type of regulatory element that simultaneously controls the splicing of multiple alternative exons.
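The isoform count follows from simple combinatorics: one variable exon is chosen from each of the four clusters, so the total is the product of the cluster sizes. The short check below is a sketch of that arithmetic. The sizes of the exon 4 (12) and exon 6 (48) clusters are stated in these abstracts; the sizes used for the exon 9 (33) and exon 17 (2) clusters are assumptions taken from the commonly cited Dscam gene structure and are consistent with the stated total of 95 alternative exons.

```python
from math import prod

# Variable exons per cluster. Exon 4 and exon 6 counts come from the text;
# exon 9 and exon 17 counts are assumed from the commonly cited gene structure.
DSCAM_CLUSTERS = {"exon 4": 12, "exon 6": 48, "exon 9": 33, "exon 17": 2}


def isoform_count(clusters):
    """One variable exon is chosen per cluster, so the choices multiply."""
    return prod(clusters.values())


def alternative_exon_count(clusters):
    """Total number of alternative exons across all clusters."""
    return sum(clusters.values())


print(isoform_count(DSCAM_CLUSTERS))           # 12 * 48 * 33 * 2 = 38016
print(alternative_exon_count(DSCAM_CLUSTERS))  # 12 + 48 + 33 + 2 = 95
```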
SR proteins are essential pre-mRNA splicing factors that have been shown to bind a number of exonic splicing enhancers where they function to stimulate the splicing of adjacent introns. Members of the SR protein family contain one or two N-terminal RNA binding domains, as well as a C-terminal arginine–serine (RS) rich domain. The RS domains mediate protein–protein interactions with other RS domain containing proteins and are essential for many, but not all, SR protein functions. Hybrid proteins containing an RS domain fused to the bacteriophage MS2 coat protein are sufficient to activate enhancer-dependent splicing in HeLa cell nuclear extract when bound to the pre-mRNA. Here we report progress towards determining the protein sequence requirements for RS domain function. We show that the RS domains from non-SR proteins can also function as splicing activation domains when tethered to the pre-mRNA. Truncation experiments with the RS domain of the human SR protein 9G8 identified a 29 amino acid segment, containing 26 arginine or serine residues, that is sufficient to activate splicing when fused to MS2. We also show that synthetic domains composed solely of RS dipeptides are capable of activating splicing, although their potency is proportional to their size.