CRISPR/Cas systems constitute a widespread class of immunity systems that protect bacteria and archaea against phages and plasmids, and commonly use repeat/spacer-derived short crRNAs to silence foreign nucleic acids in a sequence-specific manner. Although the maturation of crRNAs represents a key event in CRISPR activation, the responsible endoribonucleases (CasE, Cas6, Csy4) are missing in many CRISPR/Cas subtypes. Here, differential RNA sequencing of the human pathogen Streptococcus pyogenes uncovered tracrRNA, a trans-encoded small RNA with 24 nucleotide complementarity to the repeat regions of crRNA precursor transcripts. We show that tracrRNA directs the maturation of crRNAs by the activities of the widely conserved endogenous RNase III and the CRISPR-associated Csn1 protein; all these components are essential to protect S. pyogenes against prophage-derived DNA. Our study reveals a novel pathway of small guide RNA maturation and the first example of a host factor (RNase III) required for bacterial RNA-mediated immunity against invaders.
Phages are the most abundant biological entities on earth and pose a constant challenge to their bacterial hosts. Thus, bacteria have evolved numerous ‘innate’ mechanisms of defense against phage, such as abortive infection or restriction/modification systems. In contrast, the clustered regularly interspaced short palindromic repeats (CRISPR) systems provide acquired, yet heritable, sequence-specific ‘adaptive’ immunity against phage and other horizontally-acquired elements, such as plasmids. Resistance is acquired following viral infection or plasmid uptake when a short sequence of the foreign genome is added to the CRISPR array. CRISPRs are then transcribed and processed, generally by CRISPR associated (Cas) proteins, into short interfering RNAs (crRNAs), which form part of a ribonucleoprotein complex. This complex guides the crRNA to the complementary invading nucleic acid and targets this for degradation. Recently, there have been rapid advances in our understanding of CRISPR/Cas systems. In this review, we will present the current model(s) of the molecular events involved in both the acquisition of immunity and interference stages and will also address recent progress in our knowledge of the regulation of CRISPR/Cas systems.
phages; plasmids; horizontal gene transfer; CRISPR; Cas; cascade; PAM; crRNA; resistance
CRISPR/Cas (Clustered Regularly Interspaced Short Palindromic Repeats/CRISPR associated sequences) is a recently discovered prokaryotic defense system against foreign DNA, including viruses and plasmids. The CRISPR cassette is transcribed as a continuous transcript (pre-crRNA), which is processed by Cas proteins into small RNA molecules (crRNAs) that are responsible for defense against invading viruses. Experiments in E. coli report that overexpression of cas genes generates a large number of crRNAs from only a few pre-crRNAs.
We here develop a minimal model of CRISPR processing, which we parameterize based on available experimental data. From the model, we show that the system can generate a large amount of crRNAs, based on only a small decrease in the amount of pre-crRNAs. The relationship between the decrease of pre-crRNAs and the increase of crRNAs corresponds to strong linear amplification. Interestingly, this strong amplification crucially depends on fast non-specific degradation of pre-crRNA by an unidentified nuclease. We show that overexpression of cas genes above a certain level does not result in further increase of crRNA, but that this saturation can be relieved if the rate of CRISPR transcription is increased. We furthermore show that a small increase of CRISPR transcription rate can substantially decrease the extent of cas gene activation necessary to achieve a desired amount of crRNA.
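The amplification and saturation described above can be illustrated with a minimal steady-state sketch. It assumes first-order kinetics: pre-crRNA is transcribed at rate `phi`, degraded non-specifically at rate `lam_pre`, and processed by Cas proteins at rate `k`, while crRNA decays at rate `lam_cr`. These symbols and all numerical values are illustrative assumptions, not the authors' equations or fitted parameters.

```python
def steady_state(phi, k, lam_pre, lam_cr):
    """Steady state of the assumed two-species model:
         d[pre]/dt = phi - (lam_pre + k) * pre
         d[cr]/dt  = k * pre - lam_cr * cr
    Returns (pre, cr) at steady state.
    """
    pre = phi / (lam_pre + k)
    cr = k * pre / lam_cr
    return pre, cr


# Illustrative (assumed) rates; not taken from the source.
PHI = 10.0      # pre-crRNA transcription rate (molecules/min)
LAM_PRE = 1.0   # fast non-specific pre-crRNA degradation (1/min)
LAM_CR = 0.01   # slow crRNA decay (1/min)

# Low processing rate (uninduced cas) versus higher rates (cas overexpression).
pre_low, cr_low = steady_state(PHI, 0.01, LAM_PRE, LAM_CR)
pre_high, cr_high = steady_state(PHI, 1.0, LAM_PRE, LAM_CR)

# Gain: crRNA molecules gained per pre-crRNA molecule lost.
gain = (cr_high - cr_low) / (pre_low - pre_high)

# Saturation: with very fast processing, crRNA approaches phi / lam_cr.
_, cr_saturated = steady_state(PHI, 1e6, LAM_PRE, LAM_CR)

if __name__ == "__main__":
    for k in (0.01, 0.1, 1.0, 10.0, 100.0):
        pre, cr = steady_state(PHI, k, LAM_PRE, LAM_CR)
        print(f"k={k:>6}: pre-crRNA={pre:8.3f}  crRNA={cr:9.3f}")
    print(f"gain = {gain:.1f}, ceiling = {PHI / LAM_CR:.1f}")
```

Under these assumptions, eliminating `k` gives cr = (phi - lam_pre * pre) / lam_cr, so the gain is the constant ratio lam_pre / lam_cr. This is why fast non-specific degradation of pre-crRNA yields strong linear amplification, and why the crRNA ceiling phi / lam_cr can only be raised by increasing transcription.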
The simple mathematical model developed here is able to explain existing experimental observations on CRISPR transcript processing in Escherichia coli. The model shows that a competition between specific pre-crRNA processing and non-specific degradation determines the steady-state levels of crRNA and is responsible for strong linear amplification of crRNAs when cas genes are overexpressed. The model further shows how disappearance of only a few pre-crRNA molecules normally present in the cell can lead to a large (two orders of magnitude) increase of crRNAs upon cas overexpression. A crucial ingredient of this large increase is fast non-specific degradation of pre-crRNA, which suggests that one or more as yet unidentified nucleases are a major control element of the CRISPR response. Transcriptional regulation may be another important control mechanism, as it can either increase the amount of generated pre-crRNA, or alter the level of cas gene activity.
This article was reviewed by Mikhail Gelfand, Eugene Koonin and L Aravind.
CRISPR/Cas; Transcript processing; Small RNA; CRISPR expression regulation; CRISPR/Cas response
Because of the mutagenic consequences of mobile genetic elements, elaborate defenses have evolved to restrict their activity. A major system that controls the activity of transposable elements (TEs) in flies and vertebrates is mediated by Piwi-interacting RNAs (piRNAs), which are ~24–30 nucleotide RNAs that are bound by Piwi-class effectors. The piRNA system is thought to provide primarily a germline defense against TE activity.
Here, we describe a second system that represses Drosophila TEs by using endogenous small interfering RNAs (siRNAs), which are 21-nucleotide, 3′-end-modified RNAs that are dependent on Dicer-2 and Argonaute-2. In contrast to piRNAs, we find that the TE-siRNA system is active in somatic tissues, and particularly so in various immortalized cell lines. Analysis of the patterns and properties of TE-derived small RNAs reveals further distinctions between TE regions and genomic loci that are converted into piRNAs and siRNAs, respectively. Finally, functional tests show that many transposon transcripts accumulate to higher levels in cells and animal tissues that are deficient for Dicer-2 or Argonaute-2.
Drosophila utilizes two small-RNA systems to restrict transposon activity in the germline (mostly via piRNAs) and in the soma (mostly via siRNAs).
Piwi-interacting RNAs (piRNAs) are a recently discovered class of 24- to 30-nt noncoding RNAs whose best-understood function is to repress transposable elements (TEs) in animal germ lines. In humans, TE-derived sequences comprise ∼45% of the genome and there are several active TE families, including LINE-1 and Alu elements, which are a significant source of de novo mutations and intrapopulation variability. In the “ping-pong model,” piRNAs are thought to alternately cleave sense and antisense TE transcripts in a positive feedback loop. Because piRNAs are poorly conserved between closely related species, including human and chimpanzee, we took a population genomics approach to study piRNA function and evolution. We found strong statistical evidence that piRNA sequences are under selective constraint in African populations. We then mapped the piRNA sequences to human TE sequences and found strong correlations between the age of each LINE-1 and Alu subfamily and the number of piRNAs mapping to the subfamily. This result supports the idea that piRNAs function as repressors of TEs in humans. Finally, we observed a significant depletion of piRNA matches in the reverse transcriptase region of the consensus human LINE-1 element but not of the consensus mouse LINE-1 element. This result suggests that reverse transcriptase might have an endogenous role specific to humans. Overall, our results elucidate the function and evolution of piRNAs in humans and highlight the utility of population genomics analysis for studying this rapidly evolving genetic system.
piRNAs; transposable elements; population genetics; selective constraint; Africans
Piwi-interacting RNAs (piRNAs) are a recently discovered class of small non-coding RNA found in animals. piRNAs are primarily expressed in the germline, where their best understood function is to repress transposable elements. Unlike previous studies that investigated the evolution of piRNA-generating loci at the level of nucleotide substitutions, here we studied the evolution of piRNA-generating loci at the level of copy number variation (i.e. duplications and deletions) using genome-wide copy number variation data from three human populations. Our analysis shows that at the level of copy number variation there is strong selective constraint and a very high mutation rate in human piRNA-generating loci. Our results differ from a model of positive selection on copy number variation in piRNA-generating loci previously proposed in rodents. We discuss possible reasons for this difference based on the transposable element insertion histories in the rodent and primate lineages.
The Cutoff protein regulates piRNA cluster expression and piRNA production in the Drosophila germline
The identity and function of many factors involved in the piRNA pathway remain unknown. Here, in Drosophila, cutoff plays a role in regulating piRNA cluster transcript levels and biogenesis together with the heterochromatin protein Rhino.
In a broad range of organisms, Piwi-interacting RNAs (piRNAs) have emerged as core components of a surveillance system that protects the genome by silencing transposable and repetitive elements. A vast proportion of piRNAs is produced from discrete genomic loci, termed piRNA clusters, which are generally embedded in heterochromatic regions. The molecular mechanisms and the factors that govern their expression are largely unknown. Here, we show that Cutoff (Cuff), a Drosophila protein related to the yeast transcription termination factor Rai1, is essential for piRNA production in germline tissues. Cuff accumulates at centromeric/pericentromeric positions in germ-cell nuclei and strongly colocalizes with the major heterochromatic domains. Remarkably, we show that Cuff is enriched at the dual-strand piRNA cluster 1/42AB and is likely to be involved in regulation of transcript levels of similar loci dispersed in the genome. Consistent with this observation, Cuff physically interacts with the Heterochromatin Protein 1 (HP1) variant Rhino (Rhi). Our results unveil a link between Cuff activity, heterochromatin assembly and piRNA cluster expression, which is critical for stem-cell and germ-cell development in Drosophila.
cutoff; Drosophila; germline; heterochromatin; piRNA
In Drosophila, Piwi proteins associate with Piwi-interacting RNAs (piRNAs) and protect the germline genome by silencing mobile genetic elements. This defense system acts in germline and gonadal somatic tissue to preserve germline development. Genetic control for these silencing pathways varies greatly between tissues of the gonad. Here, we identified Vreteno (Vret), a novel gonad-specific protein essential for germline development. Vret is required for piRNA-based transposon regulation in both germline and somatic gonadal tissues. We show that Vret, which contains Tudor domains, associates physically with Piwi and Aubergine (Aub), stabilizing these proteins via a gonad-specific mechanism that is absent in other fly tissues. In the absence of vret, Piwi-bound piRNAs are lost without changes in piRNA precursor transcript production, supporting a role for Vret in primary piRNA biogenesis. In the germline, piRNAs can engage in an Aub- and Argonaute 3 (AGO3)-dependent amplification in the absence of Vret, suggesting that Vret function can distinguish between primary piRNAs loaded into Piwi-Aub complexes and piRNAs engaged in the amplification cycle. We propose that Vret plays an essential role in transposon regulation at an early stage of primary piRNA processing.
Germline stem cell; Soma; Transposon; Piwi; Aubergine; piRNAs; Tudor; Drosophila
PIWI-interacting RNAs (piRNAs) are germline-specific small non-coding RNAs that form piRNA-induced silencing complexes (piRISCs) by associating with PIWI proteins, a subclade of the Argonaute proteins predominantly expressed in the germline. piRISCs protect the integrity of the germline genome from invasive transposable DNA elements by silencing them. Multiple piRNA biogenesis factors have been identified in Drosophila. The majority of piRNA factors are localized in the nuage, electron-dense non-membranous cytoplasmic structures located in the perinuclear regions of germ cells. Thus, piRNA biogenesis is thought to occur in the nuage in germ cells. Immunofluorescence analyses of ovaries from piRNA factor mutants have revealed a localization hierarchy of piRNA factors in female nuage. However, whether this hierarchy is female-specific or can also be applied in male gonads remains undetermined. Here, we show by immunostaining of both ovaries and testes from piRNA factor mutants that the molecular hierarchy of piRNA factors shows gender-specificity, especially for Krimper (Krimp), a Tudor-domain-containing protein of unknown function(s): Krimp is dispensable for PIWI protein Aubergine (Aub) nuage localization in ovaries but Krimp and Aub require each other for their proper nuage localization in testes. This suggests that the functional requirement of Krimp in piRNA biogenesis may be different in male and female gonads.
nuage; piRNA; PIWI; Drosophila; germline
CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) loci provide prokaryotes with an adaptive immunity against viruses and other mobile genetic elements. CRISPR arrays can be transcribed and processed into small crRNA molecules, which are then used by the cell to target the foreign nucleic acid. Since spacers are accumulated by active CRISPR/Cas systems, the sequences of these spacers provide a record of the past "infection history" of the organism.
Here we analyzed all currently known spacers present in archaeal genomes and identified their source by DNA similarity. While nearly 50% of archaeal spacers matched mobile genetic elements, such as plasmids or viruses, several others matched chromosomal genes of other organisms, primarily other archaea. Thus, networks of gene exchange between archaeal species were revealed by the spacer analysis, including many cases of inter-genus and inter-species gene transfer events. Spacers that recognize viral sequences tend to be located further away from the leader sequence, implying that there exists a selective pressure for their retention.
CRISPR spacers provide direct evidence for extensive gene exchange in archaea, especially within genera, and support the current dogma where the primary role of the CRISPR/Cas system is anti-viral and anti-plasmid defense.
Open peer review
This article was reviewed by: Profs. W. Ford Doolittle, John van der Oost, Christa Schleper (nominated by board member Prof. J Peter Gogarten)
CRISPR; Lateral Gene transfer; Horizontal gene transfer; viruses; archaea; competence
Prokaryotes have developed several strategies to defend themselves against foreign genetic elements. One of those defense mechanisms is the recently identified CRISPR/Cas system, which is used by approximately half of all bacterial and almost all archaeal organisms. The CRISPR/Cas system differs from the other defense strategies because it is adaptive, hereditary and it recognizes the invader by a sequence specific mechanism. To identify the invading foreign nucleic acid, a crRNA that matches the invader DNA is required, as well as a short sequence motif called protospacer adjacent motif (PAM). We recently identified the PAM sequences for the halophilic archaeon Haloferax volcanii, and found that several motifs were active in triggering the defense reaction. In contrast, selection of protospacers from the invader seems to be based on fewer PAM sequences, as evidenced by comparative sequence data. This suggests that the selection of protospacers has stricter requirements than the defense reaction. Comparison of CRISPR-repeat sequences carried by sequenced haloarchaea revealed that in more than half of the species, the repeat sequence is conserved and that they have the same CRISPR/Cas type.
Haloferax volcanii; CRISPR/Cas; PAM; archaea; prokaryotic immune system; haloarchaea
Streptococcus thermophilus, similar to other Bacteria and Archaea, has developed defense mechanisms to protect cells against invasion by foreign nucleic acids, such as virus infections and plasmid transformations. One defense system recently described in these organisms is the CRISPR-Cas system (Clustered Regularly Interspaced Short Palindromic Repeats loci coupled to CRISPR-associated genes). Two S. thermophilus CRISPR-Cas systems, CRISPR1-Cas and CRISPR3-Cas, have been shown to actively block phage infection. The CRISPR1-Cas system interferes by cleaving foreign dsDNA entering the cell in a length-specific and orientation-dependent manner. Here, we show that the S. thermophilus CRISPR3-Cas system acts by cleaving phage dsDNA genomes at the same specific position inside the targeted protospacer as observed with the CRISPR1-Cas system. Only one cleavage site was observed in all tested strains. Moreover, we observed that the CRISPR1-Cas and CRISPR3-Cas systems are compatible and, when both systems are present within the same cell, provide increased resistance against phage infection by both cleaving the invading dsDNA. We also determined that overall phage resistance efficiency is correlated to the total number of newly acquired spacers in both CRISPR loci.
Piwi-interacting RNAs (piRNAs) fulfill a critical, conserved role in defending the genome against foreign genetic elements. In many organisms, piRNAs appear to be derived from processing of a long, polycistronic RNA precursor. Here, we establish that each Caenorhabditis elegans piRNA represents a tiny, autonomous transcriptional unit. Remarkably, the minimal C. elegans piRNA cassette requires only a 21 nucleotide (nt) piRNA sequence and an ∼50 nt upstream motif with limited genomic context for expression. Combining computational analyses with a novel, in vivo transgenic system, we demonstrate that this upstream motif is necessary for independent expression of a germline-enriched, Piwi-dependent piRNA. We further show that a single nucleotide position within this motif directs differential germline enrichment. Accordingly, over 70% of C. elegans piRNAs are selectively expressed in male or female germline, and comparison of the genes they target suggests that these two populations have evolved independently. Together, our results indicate that C. elegans piRNA upstream motifs act as independent promoters to specify which sequences are expressed as piRNAs, how abundantly they are expressed, and in what germline. As the genome encodes well over 15,000 unique piRNA sequences, our study reveals that the number of transcriptional units encoding piRNAs rivals the number of mRNA coding genes in the C. elegans genome.
Across the animal kingdom, Piwi-interacting small RNAs (piRNAs) protect genome integrity and promote fertility. While the functions of piRNAs are well-characterized, far less is known about how they are generated and how their expression is regulated. In the Caenorhabditis elegans genome, a conserved sequence motif lies upstream of many piRNA loci and appears to regulate their expression. We combined computational and experimental approaches to investigate the role of this motif in the expression of C. elegans piRNAs. We discovered that >70% of piRNAs are differentially enriched in male versus female germline, and these male and female piRNAs show different upstream motifs. Using a transgenic system for expressing synthetic piRNAs in vivo, we demonstrate that variation of a single nucleotide within this motif influences piRNA germline enrichment. We further show that the conserved motif is capable of driving piRNA expression in genomic isolation. Accordingly, the genomic distribution of these motifs determines which sequences are expressed as piRNAs in C. elegans. Our results suggest that each C. elegans piRNA represents an independent transcript whose sequence, abundance, and germline enrichment are encoded by a variant upstream motif, defining a novel modality for expression of piRNAs.
Protecting the genome from transposable element (TE) mobilization is critical for germline development. In Drosophila, Piwi proteins and their bound small RNAs (piRNAs) provide a potent defense against TE activity. TE-targeting piRNAs are processed from TE-dense heterochromatic loci termed ‘piRNA clusters’. While piRNA biogenesis from cluster precursors is beginning to be understood, little is known about piRNA cluster transcriptional regulation. Here we show that deposition of histone 3 lysine 9 methylation by the methyltransferase dSETDB1 (egg) is required for piRNA cluster transcription. In the absence of dSETDB1, cluster precursor transcription collapses in germline and somatic gonadal cells and TEs are activated, resulting in germline loss and a block in germline stem cell differentiation. We propose that heterochromatin protects the germline by activating the piRNA pathway.
Studies of the Escherichia, Neisseria, Thermotoga, and Mycobacteria clustered regularly interspaced short palindromic repeat (CRISPR) subtypes have resulted in a model whereby CRISPRs function as a defense system against bacteriophage infection and conjugative plasmid transfer. In contrast, we previously showed that the Yersinia-subtype CRISPR region of Pseudomonas aeruginosa strain UCBPP-PA14 plays no detectable role in viral immunity but instead is required for bacteriophage DMS3-dependent inhibition of biofilm formation by P. aeruginosa. The goal of this study is to define the components of the Yersinia-subtype CRISPR region required to mediate this bacteriophage-host interaction. We show that the Yersinia-subtype-specific CRISPR-associated (Cas) proteins Csy4 and Csy2 are essential for small CRISPR RNA (crRNA) production in vivo, while the Csy1 and Csy3 proteins are not absolutely required for production of these small RNAs. Further, we present evidence that the core Cas protein Cas3 functions downstream of small crRNA production and that this protein requires functional HD (predicted phosphohydrolase) and DEXD/H (predicted helicase) domains to suppress biofilm formation in DMS3 lysogens. We also determined that only spacer 1, which is not identical to any region of the DMS3 genome, mediates the CRISPR-dependent loss of biofilm formation. Our evidence suggests that gene 42 of phage DMS3 (DMS3-42) is targeted by CRISPR2 spacer 1 and that this targeting tolerates multiple point mutations between the spacer and DMS3-42 target sequence. This work demonstrates how the interaction between P. aeruginosa strain UCBPP-PA14 and bacteriophage DMS3 can be used to further our understanding of the diverse roles of CRISPR system function in bacteria.
Piwi Argonautes and Piwi-interacting RNAs (piRNAs) mediate genome defense by targeting transposons. However, many piRNA species lack obvious sequence complementarity to transposons or other loci; only one C. elegans transposon is a known piRNA target. Here we show that, in mutants lacking the Piwi Argonaute PRG-1 (and consequently its associated piRNAs/21U-RNAs), many silent loci in the germline exhibit increased levels of mRNA expression and depletion of an amplified RNA-dependent RNA polymerase (RdRP)-derived species of small secondary RNA termed 22G-RNAs. Sequences depleted of 22G-RNAs are enriched at nearby potential target sites that base pair imperfectly but extensively to 21U-RNAs. We show that PRG-1 is required to initiate, but not to maintain, silencing of transgenes engineered to contain complementarity to endogenous 21U-RNAs. Our findings support a model in which C. elegans piRNAs utilize their enormous repertoire of targeting capacity to scan the germline transcriptome for foreign sequences, while endogenous germline-expressed genes are actively protected from piRNA-induced silencing.
The interaction of viruses and their prokaryotic hosts shaped the evolution of bacterial and archaeal life. Prokaryotes developed several strategies to evade viral attacks that include restriction modification, abortive infection and CRISPR/Cas systems. These adaptive immune systems found in many Bacteria and most Archaea consist of clustered regularly interspaced short palindromic repeat (CRISPR) sequences and a number of CRISPR associated (Cas) genes (Fig. 1) 1-3. Different sets of Cas proteins and repeats define at least three major divergent types of CRISPR/Cas systems 4. The universal proteins Cas1 and Cas2 are proposed to be involved in the uptake of viral DNA that will generate a new spacer element between two repeats at the 5' terminus of an extending CRISPR cluster 5. The entire cluster is transcribed into a precursor-crRNA containing all spacer and repeat sequences and is subsequently processed by an enzyme of the diverse Cas6 family into smaller crRNAs 6-8. These crRNAs consist of the spacer sequence flanked by a 5' terminal tag (8 nucleotides) and a 3' terminal tag derived from the repeat sequence 9. A repeated infection by the virus can now be blocked as the new crRNA will be directed by a Cas protein complex (Cascade) to the viral DNA and identify it as such via base complementarity 10. Finally, for CRISPR/Cas type I systems, the nuclease Cas3 will destroy the detected invader DNA 11,12.
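The crRNA architecture described above (spacer flanked by an 8-nucleotide 5' tag and a 3' tag, both derived from the repeat) can be sketched in a few lines. The sketch assumes one cleavage per repeat, placed 8 nucleotides upstream of the repeat's 3' end, and no further trimming of the 3' tag. The repeat and spacer sequences and the function name `process_pre_crrna` are invented for illustration and do not come from any real organism.

```python
def process_pre_crrna(pre_crrna, repeat, tag_len=8):
    """Cut a pre-crRNA once inside every repeat, tag_len nt before the
    repeat's 3' end, and return the fragments between consecutive cuts.
    Each fragment is: 5' tag (last tag_len nt of a repeat) + spacer +
    3' tag (the remaining first part of the next repeat).
    """
    cut_offset = len(repeat) - tag_len
    cuts = []
    pos = pre_crrna.find(repeat)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = pre_crrna.find(repeat, pos + len(repeat))
    return [pre_crrna[a:b] for a, b in zip(cuts, cuts[1:])]


# Hypothetical 30-nt repeat and 32-nt spacers (illustration only).
REPEAT = "GUUCCAAUAAGACUAAAAUAGAAUUGAAAG"
SPACERS = ["ACGU" * 8, "CCGG" * 8]

# Build repeat-spacer-repeat-spacer-repeat, as in a transcribed CRISPR cluster.
PRE_CRRNA = REPEAT + "".join(spacer + REPEAT for spacer in SPACERS)

CRRNAS = process_pre_crrna(PRE_CRRNA, REPEAT)

if __name__ == "__main__":
    for crrna in CRRNAS:
        print(crrna[:8], crrna[8:-22], crrna[-22:])  # 5' tag, spacer, 3' tag
```

In this toy model every mature crRNA carries the same two repeat-derived tags, so only the spacer differs between crRNAs.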
These processes define CRISPR/Cas as an adaptive immune system of prokaryotes and opened a fascinating research field for the study of the involved Cas proteins. The function of many Cas proteins is still elusive and the causes for the apparent diversity of the CRISPR/Cas systems remain to be illuminated. Potential activities of most Cas proteins were predicted via detailed computational analyses. A major fraction of Cas proteins are either shown or proposed to function as endonucleases 4.
Here, we present methods to generate crRNAs and precursor-crRNAs for the study of Cas endoribonucleases. Different endonuclease assays require either short repeat sequences that can directly be synthesized as RNA oligonucleotides or longer crRNA and pre-crRNA sequences that are generated via in vitro T7 RNA polymerase run-off transcription. This methodology allows the incorporation of radioactive nucleotides for the generation of internally labeled endonuclease substrates and the creation of synthetic or mutant crRNAs. Cas6 endonuclease activity is utilized to mature pre-crRNAs into crRNAs with 5'-hydroxyl and 2',3'-cyclic phosphate termini.
Molecular biology; Issue 67; CRISPR/Cas; endonuclease; in vitro transcription; crRNA; Cas6
CRISPR/Cas is a recently discovered prokaryotic immune system, which is based on small RNAs (“spacers”) that restrict phage and plasmid infection. It has been hypothesized that CRISPRs can also regulate self gene expression by utilizing spacers that target self genes. By analyzing CRISPRs from 330 organisms we found that one in every 250 spacers is self targeting, and that such self-targeting occurs in 18% of all CRISPR-bearing organisms. However, complete lack of conservation across species, combined with abundance of degraded repeats near self-targeting spacers, suggests that self-targeting is a consequence of autoimmunity rather than gene regulation. We propose that accidental incorporation of self nucleic-acids by CRISPR can incur an autoimmune fitness cost, which may explain the abundance of degraded CRISPR systems across prokaryotes.
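A self-targeting spacer, in the sense used above, is one that also matches the host genome outside its own CRISPR array. The following is a minimal sketch of that test, assuming exact matching on either strand. The source does not state the matching criteria used, so the exact-match rule, the function names, and the toy sequences are all assumptions for illustration.

```python
def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_all(haystack, needle):
    """Yield every (possibly overlapping) start index of needle."""
    start = haystack.find(needle)
    while start != -1:
        yield start
        start = haystack.find(needle, start + 1)


def self_targeting_spacers(genome, array_start, array_end, spacers):
    """Map each spacer to the genome positions it matches outside the
    CRISPR array [array_start, array_end), on either strand."""
    hits = {}
    for spacer in spacers:
        positions = []
        for query in {spacer, reverse_complement(spacer)}:
            for pos in find_all(genome, query):
                # Skip the spacer's own copy inside the array.
                if pos + len(query) <= array_start or pos >= array_end:
                    positions.append(pos)
        if positions:
            hits[spacer] = sorted(positions)
    return hits


# Toy genome (all sequences invented): only spacer_b has a chromosomal target.
repeat = "GTTTTAGAGC"
spacer_a = "ACGTTGCAAGGCTA"
spacer_b = "CCATGGATTCAAGT"
upstream = "ATATATATAT"
array = repeat + spacer_a + repeat + spacer_b + repeat
downstream = "GGGCGC" + reverse_complement(spacer_b) + "TTTAAA"
genome = upstream + array + downstream
array_start = len(upstream)
array_end = array_start + len(array)

hits = self_targeting_spacers(genome, array_start, array_end, [spacer_a, spacer_b])

if __name__ == "__main__":
    print(hits)
```

A genome-scale analysis would more plausibly use an alignment tool that tolerates mismatches; exact string search is used here only to keep the sketch self-contained.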
Small non-coding RNAs have emerged as key players in epigenetic regulation. Recently, a novel class of small RNAs that interact with Piwi proteins has been discovered in the mammalian and Drosophila germline. These Piwi-interacting RNAs (piRNAs) represent a distinct small RNA pathway that is widely thought to function only in the germline. In this essay, we review our recent work with our collaborators on the epigenetic function of the Drosophila Piwi protein and its associated piRNAs in somatic cells. This work has revealed a novel epigenetic mechanism mediated by Piwi and its associated piRNAs in somatic cells that might also be applicable to the germline. Based on these results, we propose a “Piwi-piRNA guidance hypothesis” for Piwi/piRNA-mediated epigenetic programming, in which the Piwi-piRNA complex serves as a sequence-recognition machinery that recruits epigenetic effectors such as Heterochromatin Protein 1a (HP1a) to specific sites in the genome to execute epigenetic regulation.
CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) loci, together with cas (CRISPR–associated) genes, form the CRISPR/Cas adaptive immune system, a primary defense strategy that eubacteria and archaea mobilize against foreign nucleic acids, including phages and conjugative plasmids. Short spacer sequences separated by the repeats are derived from foreign DNA and direct interference to future infections. The availability of hundreds of shotgun metagenomic datasets from the Human Microbiome Project (HMP) enables us to explore the distribution and diversity of known CRISPRs in human-associated microbial communities and to discover new CRISPRs. We propose a targeted assembly strategy to reconstruct CRISPR arrays, which whole-metagenome assemblies fail to identify. For each known CRISPR type (identified from reference genomes), we use its direct repeat consensus sequence to recruit reads from each HMP dataset and then assemble the recruited reads into CRISPR loci; the unique spacer sequences can then be extracted for analysis. We also identified novel CRISPRs or new CRISPR variants in contigs from whole-metagenome assemblies and used targeted assembly to more comprehensively identify these CRISPRs across samples. We observed that the distributions of CRISPRs (including 64 known and 86 novel ones) are largely body-site specific. We provide detailed analysis of several CRISPR loci, including novel CRISPRs. For example, known streptococcal CRISPRs were identified in most oral microbiomes, totaling ∼8,000 unique spacers: samples resampled from the same individual and oral site shared the most spacers; different oral sites from the same individual shared significantly fewer, while different individuals had almost no common spacers, indicating the impact of subtle niche differences on the evolution of CRISPR defenses. We further demonstrate potential applications of CRISPRs to the tracing of rare species and the virus exposure of individuals. 
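The recruit-then-extract idea behind the targeted assembly strategy can be sketched as follows. This is a deliberately simplified illustration, not the authors' pipeline: it recruits reads by exact forward-strand match to a repeat consensus, skips the assembly step entirely, and reports only spacers that lie between two repeat copies within a single read. The function names and all sequences are hypothetical.

```python
def recruit_reads(reads, repeat):
    """Keep only reads that contain the direct-repeat consensus."""
    return [read for read in reads if repeat in read]


def extract_spacers(read, repeat):
    """Return the complete spacers in a read, i.e. the segments that are
    flanked by a repeat copy on both sides."""
    pieces = read.split(repeat)
    # The first and last pieces are flanking sequence or partial spacers.
    return [piece for piece in pieces[1:-1] if piece]


def unique_spacers(reads, repeat):
    """Collect the set of unique spacers across all recruited reads."""
    spacers = set()
    for read in recruit_reads(reads, repeat):
        spacers.update(extract_spacers(read, repeat))
    return spacers


# Toy data (invented): two reads from a CRISPR locus, one unrelated read.
REPEAT = "GTTTTAGAGC"
READS = [
    "AAC" + REPEAT + "ACGTACGTAA" + REPEAT + "TT",
    REPEAT + "GGCCTTAAGG" + REPEAT + "ACGTACGTAA" + REPEAT,
    "ACGTACGTACGTACGT",
]

RECRUITED = recruit_reads(READS, REPEAT)
SPACER_SET = unique_spacers(READS, REPEAT)

if __name__ == "__main__":
    print(len(RECRUITED), sorted(SPACER_SET))
```

Comparing such spacer sets between samples (for example by set intersection) is one simple way to quantify the spacer sharing discussed above.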
This work indicates the importance of effective identification and characterization of CRISPR loci to the study of the dynamic ecology of microbiomes.
Human bodies are complex ecological systems in which various microbial organisms and viruses interact with each other and with the human host. The Human Microbiome Project (HMP) has resulted in >700 datasets of shotgun metagenomic sequences, from which we can learn about the compositions and functions of human-associated microbial communities. CRISPR/Cas systems are a widespread class of adaptive immune systems in bacteria and archaea, providing acquired immunity against foreign nucleic acids: CRISPR/Cas defense pathways involve integration of viral- or plasmid-derived DNA segments into CRISPR arrays (forming spacers between repeated structural sequences), and expression of short crRNAs from these single repeat-spacer units, to generate interference to future invading foreign genomes. Powered by an effective computational approach (the targeted assembly approach for CRISPR), our analysis of CRISPR arrays in the HMP datasets provides the very first global view of bacterial immunity systems in human-associated microbial communities. The great diversity of CRISPR spacers we observed among different body sites, in different individuals, and in single individuals over time, indicates the impact of subtle niche differences on the evolution of CRISPR defenses and indicates the key role of bacteriophage (and plasmids) in shaping human microbial communities.
Many prokaryotes contain genomic clustered regularly interspaced short palindromic repeats (CRISPRs) that confer resistance to invasive genetic elements. Central to this immune system is the production of CRISPR-derived RNAs (crRNAs) following transcription of the CRISPR locus. Here we identify the endoribonuclease (Csy4) responsible for pre-crRNA processing in Pseudomonas aeruginosa. A 1.8 Å crystal structure of Csy4 in complex with its cognate RNA reveals an unexpected recognition mechanism whereby Csy4 makes sequence-specific interactions in the major groove of the CRISPR repeat stem-loop. Together with electrostatic contacts to the phosphate backbone, these enable Csy4 to selectively bind and cleave pre-crRNAs. The active site of Csy4 comprises two invariant residues, a serine and a histidine. The RNA recognition mechanism identified here explains sequence- and structure-specific processing by a large family of CRISPR-specific endoribonucleases.
Hybrids of two Drosophila species show transposable element derepression and piRNA pathway malfunction, revealing adaptive evolution of piRNA pathway components.
The Piwi-interacting RNA (piRNA) pathway defends the germline of animals from the deleterious activity of selfish transposable elements (TEs) through small-RNA mediated silencing. Adaptation to novel invasive TEs is proposed to occur by incorporating their sequences into the piRNA pool that females produce and deposit into their eggs, which then propagates immunity against specific TEs to future generations. In support of this model, the F1 offspring of crosses between strains of the same Drosophila species sometimes suffer from germline derepression of paternally inherited TE families, caused by a failure of the maternal strain to produce the piRNAs necessary for their regulation. However, many protein components of the Drosophila piRNA pathway exhibit signatures of positive selection, suggesting that they also contribute to the evolution of host genome defense. Here we investigate piRNA pathway function and TE regulation in the F1 hybrids of interspecific crosses between D. melanogaster and D. simulans and compare them with intraspecific control crosses of D. melanogaster. We confirm previous reports showing that intraspecific crosses are characterized by derepression of paternally inherited TE families that are rare or absent from the maternal genome and piRNA pool, consistent with the role of maternally deposited piRNAs in shaping TE silencing. In contrast to the intraspecific cross, we discover that interspecific hybrids are characterized by widespread derepression of both maternally and paternally inherited TE families. Furthermore, the pattern of derepression of TE families in interspecific hybrids cannot be attributed to their paucity or absence from the piRNA pool of the maternal species. Rather, we demonstrate that interspecific hybrids closely resemble piRNA effector-protein mutants in both TE misregulation and aberrant piRNA production. 
We suggest that TE derepression in interspecific hybrids largely reflects adaptive divergence of piRNA pathway genes rather than species-specific differences in TE-derived piRNAs.
Eukaryotic genomes contain large quantities of transposable elements (TEs), short self-replicating DNA sequences that can move within the genome. The selfish replication of TEs has potentially drastic consequences for the host, such as disruption of gene function, induction of sterility, and initiation or exacerbation of some cancers. Like the adaptive immune system that defends our bodies against pathogens, the Piwi-interacting RNA (piRNA) pathway defends animal genomes against the harmful effects of TEs. Fundamental to piRNA-mediated defense is the production of small noncoding RNAs that act like antibodies to target replicating TEs for destruction by piRNA-effector proteins. piRNAs are expected to diverge rapidly between species in response to genome infection by increasingly disparate TEs. Here, we tested this hypothesis by examining how differences in piRNAs between two species of fruit fly relate to TE “immunity” in their hybrid offspring. Because piRNAs are maternally deposited, we expected excessive replication of paternal TEs in hybrids. Surprisingly, we observe increased activity of both maternal and paternal TEs, together with defects in piRNA production that are reminiscent of piRNA effector-protein mutants. Our observations reveal that piRNA effector-proteins do not function properly in hybrids, and we propose that adaptive evolution among piRNA effector-proteins contributes to host genome defense and leads to the functional incompatibilities that we observe in hybrids.
Piwi-interacting RNAs (piRNAs) are small RNAs abundant in the germline across animal species. In fruit flies and mice, piRNAs have been implicated in the maintenance of genomic integrity through silencing of transposable elements. Outside of the germline, piRNAs have only been found in fruit fly ovarian follicle cells. Previous studies have further reported the presence of multiple piRNA-like small RNAs (pilRNAs) in fly heads, and a small number of pilRNAs have been reported in mouse tissues and in human NK cells. Here, we analyze high-throughput small RNA sequencing data from more than 130 fruit fly, mouse and rhesus macaque samples. The results show a widespread presence of pilRNAs, displaying all known characteristics of piRNAs, in multiple somatic tissues of these three species. In mouse pancreas and macaque epididymis, pilRNA abundance was comparable to piRNA abundance in the germline. Using in situ hybridization, we further demonstrate pilRNA co-localization with mRNA expression of Piwi-family genes in all macaque tissues. Further, using western blot, we show expression of Miwi protein in mouse pancreas. These findings indicate that piRNA-like molecules might play important roles outside of the germline.
The CRISPR/Cas adaptive immune system provides resistance against phages and plasmids in Archaea and Bacteria. CRISPR loci integrate short DNA sequences from invading genetic elements that provide small RNA-mediated interference in subsequent exposure to matching nucleic acids. In Streptococcus thermophilus, it was previously shown that the CRISPR1/Cas system can provide adaptive immunity against phages and plasmids by integrating novel spacers following exposure to these foreign genetic elements; the spacers subsequently direct the specific cleavage of invasive homologous DNA sequences. Here, we show that the S. thermophilus CRISPR3/Cas system can be transferred into Escherichia coli and provide heterologous protection against plasmid transformation and phage infection. We show that interference is sequence-specific, and that mutations within or in the vicinity of the proto-spacer adjacent motif (PAM) allow plasmids to escape CRISPR-encoded immunity. We also establish that cas9 is the sole cas gene necessary for CRISPR-encoded interference. Furthermore, mutation analysis revealed that interference relies on the Cas9 McrA/HNH- and RuvC/RNaseH-motifs. Altogether, our results show that active CRISPR/Cas systems can be transferred across distant genera and provide heterologous interference against invasive nucleic acids. This can be leveraged to develop strains more robust against phage attack, and safer organisms less likely to take up and disseminate plasmid-encoded undesirable genetic elements.
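The dependence of interference on both a matching proto-spacer and an intact PAM can be illustrated with a toy sketch. The assumptions are substantial: the abstract does not give the PAM sequence, so the default `NGGNG` is a labeled assumption (exposed as a parameter); the PAM is assumed to lie immediately 3' of the proto-spacer; only the strand as given is searched; and an exact spacer match is required. The function names are hypothetical.

```python
import re

# Minimal IUPAC lookup; only the codes needed for this illustration.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]", "W": "[AT]"}


def pam_regex(pam):
    """Translate a PAM written with IUPAC codes into a regular expression."""
    return "".join(IUPAC[base] for base in pam)


def is_targeted(dna, spacer, pam="NGGNG"):
    """Return True if `dna` holds an exact proto-spacer followed directly by a PAM.

    Assumption: the default PAM and its position 3' of the proto-spacer are
    illustrative choices, not details stated in the text above.
    """
    pattern = re.escape(spacer) + pam_regex(pam)
    return re.search(pattern, dna) is not None


if __name__ == "__main__":
    spacer = "ACGTTGCAACGTTGCAACGT"
    wild_type = "CC" + spacer + "TGGCG" + "AA"
    pam_mutant = "CC" + spacer + "TGACG" + "AA"  # single change inside the PAM
    print(is_targeted(wild_type, spacer), is_targeted(pam_mutant, spacer))
```

In this simplified model, a plasmid escapes either by altering the proto-spacer or by altering the PAM, which mirrors the escape behavior described above without modeling tolerance to mismatches.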
CRISPR-Cas systems are recently discovered, RNA-based immune systems that control invasions of viruses and plasmids in archaea and bacteria. Prokaryotes with CRISPR-Cas immune systems capture short invader sequences within the CRISPR loci in their genomes, and small RNAs produced from the CRISPR loci (CRISPR (cr)RNAs) guide Cas proteins to recognize and degrade (or otherwise silence) the invading nucleic acids. There are multiple variations of the pathway found among prokaryotes, each mediated by largely distinct components and mechanisms that we are only beginning to delineate. Here we will review our current understanding of the remarkable CRISPR-Cas pathways with particular attention to studies relevant to systems found in the archaea.