CRISPR/Cas (Clustered Regularly Interspaced Short Palindromic Repeats/CRISPR associated sequences) is a recently discovered prokaryotic defense system against foreign DNA, including viruses and plasmids. The CRISPR cassette is transcribed as a continuous transcript (pre-crRNA), which is processed by Cas proteins into small RNA molecules (crRNAs) that are responsible for defense against invading viruses. Experiments in E. coli report that overexpression of cas genes generates a large number of crRNAs from only a few pre-crRNAs.
Here we develop a minimal model of CRISPR processing, which we parameterize based on available experimental data. From the model, we show that the system can generate a large number of crRNAs from only a small decrease in the amount of pre-crRNAs. The relationship between the decrease of pre-crRNAs and the increase of crRNAs corresponds to strong linear amplification. Interestingly, this strong amplification crucially depends on fast non-specific degradation of pre-crRNA by an unidentified nuclease. We show that overexpression of cas genes above a certain level does not result in a further increase of crRNA, but that this saturation can be relieved if the rate of CRISPR transcription is increased. We furthermore show that a small increase of the CRISPR transcription rate can substantially decrease the extent of cas gene activation necessary to achieve a desired amount of crRNA.
The simple mathematical model developed here is able to explain existing experimental observations on CRISPR transcript processing in Escherichia coli. The model shows that a competition between specific pre-crRNA processing and non-specific degradation determines the steady-state levels of crRNA and is responsible for strong linear amplification of crRNAs when cas genes are overexpressed. The model further shows how disappearance of only a few pre-crRNA molecules normally present in the cell can lead to a large (two orders of magnitude) increase of crRNAs upon cas overexpression. A crucial ingredient of this large increase is fast non-specific degradation by an unspecified nuclease, which suggests that one or more as yet unidentified nucleases are a major control element of the CRISPR response. Transcriptional regulation may be another important control mechanism, as it can either increase the amount of generated pre-crRNA, or alter the level of cas gene activity.
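The competition between processing and non-specific degradation can be illustrated with a minimal steady-state sketch. It assumes, as an illustrative reading rather than the exact published model, that pre-crRNA is transcribed at a constant rate `phi`, processed by Cas proteins at rate `k`, and degraded non-specifically at rate `lam_pre`, while crRNA decays at rate `lam_cr`. All names and numerical values below are hypothetical placeholders, not fitted parameters.

```python
# Toy two-species model (assumed form, placeholder parameters):
#   d[pre]/dt = phi - (lam_pre + k) * pre
#   d[cr]/dt  = k * pre - lam_cr * cr


def steady_state(phi, k, lam_pre, lam_cr):
    """Return steady-state (pre-crRNA, crRNA) levels of the toy model."""
    pre = phi / (lam_pre + k)   # transcription balanced by processing + decay
    cr = k * pre / lam_cr       # processing balanced by crRNA decay
    return pre, cr


# Hypothetical values, chosen only to make the trends visible.
PHI = 10.0      # pre-crRNA transcription rate (molecules/min)
LAM_PRE = 1.0   # fast non-specific pre-crRNA degradation (1/min)
LAM_CR = 0.01   # slow crRNA decay (1/min)

if __name__ == "__main__":
    # Increasing k stands in for increasing cas gene expression.
    for k in (0.01, 0.1, 1.0, 10.0, 100.0):
        pre, cr = steady_state(PHI, k, LAM_PRE, LAM_CR)
        print(f"k={k:7.2f}  pre-crRNA={pre:7.3f}  crRNA={cr:9.2f}")
```

In this sketch, eliminating `k` gives `cr = (phi - lam_pre * pre) / lam_cr`, so crRNA rises linearly as pre-crRNA falls, with slope `lam_pre / lam_cr`. The crRNA level also cannot exceed `phi / lam_cr` however large `k` becomes, and only a larger `phi` raises that ceiling.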
This article was reviewed by Mikhail Gelfand, Eugene Koonin and L Aravind.
CRISPR/Cas; Transcript processing; Small RNA; CRISPR expression regulation; CRISPR/Cas response
Background: The Cas6 protein is required for generating crRNAs in CRISPR-Cas I and III systems.
Results: The Cas6 protein is necessary for crRNA production but not sufficient for crRNA maintenance in Haloferax.
Conclusion: A Cascade-like complex is required in the type I-B system for a stable crRNA population.
Significance: The CRISPR-Cas system I-B has a Cascade complex similar to those of types I-A and I-E.
The clustered regularly interspaced short palindromic repeats/CRISPR-associated (CRISPR-Cas) system is a prokaryotic defense mechanism against foreign genetic elements. A plethora of CRISPR-Cas versions exist, with more than 40 different Cas protein families and several different molecular approaches to fight the invading DNA. One of the key players in the system is the CRISPR-derived RNA (crRNA), which directs the invader-degrading Cas protein complex to the invader. The CRISPR-Cas types I and III use the Cas6 protein to generate mature crRNAs. Here, we show that the Cas6 protein is necessary for crRNA production but that additional Cas proteins, which form a Cascade (CRISPR-associated complex for antiviral defense)-like complex, are needed for crRNA stability in the CRISPR-Cas type I-B system in Haloferax volcanii in vivo. Deletion of the cas6 gene results in the loss of mature crRNAs and interference. However, cells that have the complete cas gene cluster (cas1–8b) removed and are transformed with the cas6 gene are not able to produce and stably maintain mature crRNAs. crRNA production and stability are rescued only if cas5, -6, and -7 are present. Mutational analysis of the cas6 gene reveals three amino acids (His-41, Gly-256, and Gly-258) that are essential for pre-crRNA cleavage, whereas the mutation of two amino acids (Ser-115 and Ser-224) leads to an increase in crRNA amounts. This is the first systematic in vivo analysis of Cas6 protein variants. In addition, we show that the H. volcanii I-B system contains a Cascade-like complex with a Cas7, Cas5, and Cas6 core that protects the crRNA.
Archaea; Microbiology; Molecular Biology; Molecular Genetics; Protein Complexes; CRISPR/Cas; Cas6; Haloferax volcanii; crRNA; Type I-B
The prokaryotic antiviral defense system CRISPR (clustered regularly interspaced short palindromic repeats)/Cas (CRISPR-associated) employs short crRNAs (CRISPR RNAs) to target invading viral nucleic acids. A short spacer sequence of these crRNAs can be derived from a viral genome and recognizes a recurring attack of a virus via base complementarity. We analyzed the effect of spacer sequences on the maturation of crRNAs of the subtype I-B Methanococcus maripaludis C5 CRISPR cluster. The responsible endonuclease, termed Cas6b, bound non-hydrolyzable repeat RNA as a dimer and mature crRNA as a monomer. Comparative analysis of Cas6b processing of individual spacer-repeat-spacer RNA substrates and crRNA stability revealed the potential influence of spacer sequence and length on these parameters. Correlation of these observations with the variable abundance of crRNAs visualized by deep-sequencing analyses is discussed. Finally, insertion of spacer and repeat sequences with archaeal poly-T termination signals is suggested to be prevented in archaeal CRISPR/Cas systems.
CRISPR; Cas6; endonuclease; crRNA; in-line probing; RNA binding; transcription termination
Clustered regularly interspaced short palindromic repeats (CRISPRs) and their associated Cas proteins comprise a prokaryotic RNA-guided adaptive immune system that interferes with mobile genetic elements, such as plasmids and phages. The type I-E CRISPR interference complex Cascade from Escherichia coli is composed of five different Cas proteins and a 61-nt-long guide RNA (crRNA). crRNAs contain a unique 32-nt spacer flanked by a repeat-derived 5′ handle (8 nt) and a 3′ handle (21 nt). The spacer part of crRNA directs Cascade to DNA targets. Here, we show that the E. coli Cascade can be expressed and purified from cells lacking crRNAs and loaded in vitro with synthetic crRNAs, which direct it to targets complementary to the crRNA spacer. The deletion of even one nucleotide from the crRNA 5′ handle disrupted its binding to Cascade and target DNA recognition. In contrast, crRNA variants with just a single nucleotide downstream of the spacer part bound Cascade, and the resulting ribonucleoprotein complex containing a 41-nt-long crRNA specifically recognized DNA targets. Thus, the E. coli Cascade-crRNA system exhibits significant flexibility, suggesting that this complex can be engineered for applications in genome editing and opening the way for incorporation of site-specific labels in crRNA.
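The lengths quoted above are additive: an 8-nt 5′ handle, a 32-nt spacer and a 21-nt 3′ handle give the 61-nt crRNA, and keeping a single nucleotide downstream of the spacer gives the 41-nt variant. A small sketch of that bookkeeping follows. The function name and the placeholder sequence are hypothetical and carry no biological meaning.

```python
FIVE_PRIME_HANDLE = 8    # nt, repeat-derived
SPACER = 32              # nt
THREE_PRIME_HANDLE = 21  # nt, repeat-derived


def split_crrna(crrna, five=FIVE_PRIME_HANDLE, spacer=SPACER):
    """Split a crRNA string into (5' handle, spacer, 3' remainder)."""
    if len(crrna) < five + spacer:
        raise ValueError("crRNA shorter than 5' handle plus spacer")
    return crrna[:five], crrna[five:five + spacer], crrna[five + spacer:]


full_length = FIVE_PRIME_HANDLE + SPACER + THREE_PRIME_HANDLE
truncated_length = FIVE_PRIME_HANDLE + SPACER + 1  # one nt after the spacer

# Placeholder sequence: letters mark regions only, they are not real bases.
toy = "H" * FIVE_PRIME_HANDLE + "S" * SPACER + "T" * THREE_PRIME_HANDLE
handle5, spacer_seq, handle3 = split_crrna(toy)
print(full_length, truncated_length, len(handle5), len(spacer_seq), len(handle3))
# prints: 61 41 8 32 21
```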
All immune systems must distinguish self from non-self to repel invaders without inducing autoimmunity. Clustered, regularly interspaced, short palindromic repeat (CRISPR) loci protect bacteria and archaea from invasion by phage and plasmid DNA through a genetic interference pathway [1–9]. CRISPR loci are present in ~40% and ~90% of sequenced bacterial and archaeal genomes, respectively [10], and evolve rapidly, acquiring new spacer sequences to adapt to highly dynamic viral populations [1, 11–13]. Immunity requires a sequence match between the invasive DNA and the spacers that lie between CRISPR repeats [1–9]. Each cluster is genetically linked to a subset of the cas (CRISPR-associated) genes [14–16] that collectively encode >40 families of proteins involved in adaptation and interference. CRISPR loci encode small CRISPR RNAs (crRNAs) that contain a full spacer flanked by partial repeat sequences [2, 17–19]. CrRNA spacers are thought to identify targets by direct Watson-Crick pairing with invasive “protospacer” DNA [2, 3], but how they avoid targeting the spacer DNA within the encoding CRISPR locus itself is unknown. Here we have defined the mechanism of CRISPR self/non-self discrimination. In Staphylococcus epidermidis, target/crRNA mismatches at specific positions outside of the spacer sequence license foreign DNA for interference, whereas extended pairing between crRNA and CRISPR DNA repeats prevents autoimmunity. Hence, this CRISPR system uses the base-pairing potential of crRNAs not only to specify a target but also to spare the bacterial chromosome from interference. Differential complementarity outside of the spacer sequence is a built-in feature of all CRISPR systems, suggesting that this mechanism is a broadly applicable solution to the self/non-self dilemma that confronts all immune pathways.
CRISPR-Cas is a rapidly evolving RNA-mediated adaptive immune system that protects bacteria and archaea against mobile genetic elements. The system relies on the activity of short mature CRISPR RNAs (crRNAs) that guide Cas protein(s) to silence invading nucleic acids. One set of CRISPR-Cas systems, type II, requires a trans-activating small RNA, tracrRNA, for maturation of precursor crRNA (pre-crRNA) and interference with invading sequences. Following co-processing of tracrRNA and pre-crRNA by RNase III, dual-tracrRNA:crRNA guides the CRISPR-associated endonuclease Cas9 (Csn1) to cleave cognate target DNA site-specifically. Here, we screened available genomes for type II CRISPR-Cas loci by searching for Cas9 orthologs. We analyzed 75 representative loci, and for 56 of them we predicted novel tracrRNA orthologs. Our analysis demonstrates a high diversity in cas operon architecture and position of the tracrRNA gene within CRISPR-Cas loci. We observed a correlation between locus heterogeneity and Cas9 sequence diversity, resulting in the identification of various type II CRISPR-Cas subgroups. We validated the expression and co-processing of predicted tracrRNAs and pre-crRNAs by RNA sequencing in five bacterial species. This study reveals the tracrRNA family as an atypical, small RNA family with no obvious conservation of structure, sequence or localization within type II CRISPR-Cas loci. The tracrRNA family is, however, characterized by the conserved ability to base-pair with cognate pre-crRNA repeats, an essential function for crRNA maturation and DNA silencing by dual-RNA:Cas9. The large panel of tracrRNA and Cas9 ortholog sequences should constitute a useful database to improve the design of RNA-programmable Cas9 as a genome editing tool.
tracrRNA; CRISPR-Cas; type II system; Cas9 (Csn1); RNA processing; RNA maturation; small non-coding RNA; bacteria; adaptive immunity; mobile genetic elements
The CRISPR arrays found in many bacteria and most archaea are transcribed into a long precursor RNA that is processed into small clustered regularly interspaced short palindromic repeats (CRISPR) RNAs (crRNAs). These RNA molecules can contain fragments of viral genomes and mediate, together with a set of CRISPR-associated (Cas) proteins, prokaryotic immunity against viral attacks. CRISPR/Cas systems are diverse and the Cas6 enzymes that process crRNAs vary between different subtypes. We analysed CRISPR/Cas subtype I-B and present the identification of novel Cas6 enzymes from the bacterial and archaeal model organisms Clostridium thermocellum and Methanococcus maripaludis C5. The in vitro activity and specificity of Methanococcus maripaludis Cas6b were determined. Two complementary catalytic histidine residues were identified. RNA-Seq analyses revealed in vivo crRNA processing sites, crRNA abundance and orientation of CRISPR transcription within these two organisms. Individual spacer sequences were identified with strong effects on transcription and processing patterns of a CRISPR cluster. These effects will need to be considered for the application of CRISPR clusters that are designed to produce synthetic crRNAs.
Using the hyperthermophile Pyrococcus furiosus, we have delineated several key steps in CRISPR (clustered regularly interspaced short palindromic repeats)–Cas (CRISPR-associated) invader defence pathways. P. furiosus has seven transcriptionally active CRISPR loci that together encode a total of 200 crRNAs (CRISPR RNAs). The 27 Cas proteins in this organism represent three distinct pathways and are primarily encoded in two large gene clusters. The Cas6 protein dices CRISPR locus transcripts to generate individual invader-targeting crRNAs. The mature crRNAs include a signature sequence element (the 5′ tag) derived from the CRISPR locus repeat sequence that is important for function. crRNAs are tailored into distinct species and integrated into three distinct crRNA–Cas protein complexes that are all candidate effector complexes. The complex formed by the Cmr [Cas module RAMP (repeat-associated mysterious proteins)] (subtype III-B) proteins cleaves complementary target RNAs and can be programmed to cleave novel target RNAs in a prokaryotic RNAi-like manner. Evidence suggests that the other two CRISPR–Cas systems in P. furiosus, Csa (Cas subtype Apern) (subtype I-A) and Cst (Cas subtype Tneap) (subtype I-B), target invaders at the DNA level. Studies of the CRISPR–Cas systems from P. furiosus are yielding fundamental knowledge of mechanisms of crRNA biogenesis and silencing for three of the diverse CRISPR–Cas pathways, and reveal that organisms such as P. furiosus possess an arsenal of multiple RNA-guided mechanisms to resist diverse invaders. Our knowledge of the fascinating CRISPR–Cas pathways is leading in turn to our ability to co-opt these systems for exciting new biomedical and biotechnological applications.
clustered regularly interspaced short palindromic repeats (CRISPR); CRISPR-associated (Cas); non-coding RNA; prokaryotic immunity; Pyrococcus furiosus; virus
To fend off foreign genetic elements, prokaryotes have developed several defense systems. The most recently discovered defense system, CRISPR/Cas, is sequence-specific, adaptive and heritable. The two central components of this system are the Cas proteins and the CRISPR RNA. The latter consists of repeat sequences that are interspersed with spacer sequences. The CRISPR locus is transcribed into a precursor RNA that is subsequently processed into short crRNAs. CRISPR/Cas systems have been identified in bacteria and archaea, and data show that many variations of this system exist. We analyzed the requirements for a successful defense reaction in the halophilic archaeon Haloferax volcanii. Haloferax encodes a CRISPR/Cas system of the I-B subtype, about which very little is known. Analysis of the mature crRNAs revealed that they contain a spacer as their central element, which is preceded by an eight-nucleotide-long 5′ handle that originates from the upstream repeat. The repeat sequences have the potential to fold into a minimal stem loop. Sequencing of the crRNA population indicated that not all of the spacers that are encoded by the three CRISPR loci are present in the same abundance. By challenging Haloferax with an invader plasmid, we demonstrated that the interaction of the crRNA with the invader DNA requires a 10-nucleotide-long seed sequence. In addition, we found that not all of the crRNAs from the three CRISPR loci are effective at triggering the degradation of invader plasmids. The interference does not seem to be influenced by the copy number of the invader plasmid.
archaea; Haloferax volcanii; CRISPR/Cas; crRNA; PAM; seed sequence
CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) loci, together with cas (CRISPR–associated) genes, form the CRISPR/Cas adaptive immune system, a primary defense strategy that eubacteria and archaea mobilize against foreign nucleic acids, including phages and conjugative plasmids. Short spacer sequences separated by the repeats are derived from foreign DNA and direct interference to future infections. The availability of hundreds of shotgun metagenomic datasets from the Human Microbiome Project (HMP) enables us to explore the distribution and diversity of known CRISPRs in human-associated microbial communities and to discover new CRISPRs. We propose a targeted assembly strategy to reconstruct CRISPR arrays, which whole-metagenome assemblies fail to identify. For each known CRISPR type (identified from reference genomes), we use its direct repeat consensus sequence to recruit reads from each HMP dataset and then assemble the recruited reads into CRISPR loci; the unique spacer sequences can then be extracted for analysis. We also identified novel CRISPRs or new CRISPR variants in contigs from whole-metagenome assemblies and used targeted assembly to more comprehensively identify these CRISPRs across samples. We observed that the distributions of CRISPRs (including 64 known and 86 novel ones) are largely body-site specific. We provide detailed analysis of several CRISPR loci, including novel CRISPRs. For example, known streptococcal CRISPRs were identified in most oral microbiomes, totaling ∼8,000 unique spacers: samples resampled from the same individual and oral site shared the most spacers; different oral sites from the same individual shared significantly fewer, while different individuals had almost no common spacers, indicating the impact of subtle niche differences on the evolution of CRISPR defenses. We further demonstrate potential applications of CRISPRs to the tracing of rare species and the virus exposure of individuals. 
This work indicates the importance of effective identification and characterization of CRISPR loci to the study of the dynamic ecology of microbiomes.
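The recruit-then-extract idea behind the targeted assembly can be illustrated with a toy sketch. It assumes exact matches to the repeat consensus and skips the assembly step entirely, which real metagenomic data would not allow: sequencing errors, repeat variants and reverse-complement reads are all ignored. The function names and sequences are invented for illustration.

```python
def recruit_reads(reads, repeat):
    """Keep reads containing at least one exact copy of the repeat."""
    return [read for read in reads if repeat in read]


def extract_spacers(sequence, repeat):
    """Return the sequences lying between consecutive exact repeat copies."""
    positions = []
    start = sequence.find(repeat)
    while start != -1:
        positions.append(start)
        start = sequence.find(repeat, start + len(repeat))
    spacers = []
    for left, right in zip(positions, positions[1:]):
        spacer = sequence[left + len(repeat):right]
        if spacer:  # skip empty gaps between back-to-back repeats
            spacers.append(spacer)
    return spacers


def unique_spacers(reads, repeat):
    """Recruit reads by repeat, then collect the distinct spacers they carry."""
    found = set()
    for read in recruit_reads(reads, repeat):
        found.update(extract_spacers(read, repeat))
    return sorted(found)


REPEAT = "GGTTCCAA"  # made-up repeat consensus, far shorter than real repeats
READS = [
    "AC" + REPEAT + "ACGTACGTAC" + REPEAT + "TTGACCGGTA" + REPEAT + "GT",
    "TT" + REPEAT + "ACGTACGTAC" + REPEAT + "CA",
    "ACGTACGTACGTACGT",  # no repeat, so it is not recruited
]
print(unique_spacers(READS, REPEAT))
```

Only sequence flanked by a repeat on both sides is reported as a spacer here, so partial spacers at read ends are discarded; recovering those is one reason an assembly step is needed in practice.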
Human bodies are complex ecological systems in which various microbial organisms and viruses interact with each other and with the human host. The Human Microbiome Project (HMP) has resulted in >700 datasets of shotgun metagenomic sequences, from which we can learn about the compositions and functions of human-associated microbial communities. CRISPR/Cas systems are a widespread class of adaptive immune systems in bacteria and archaea, providing acquired immunity against foreign nucleic acids: CRISPR/Cas defense pathways involve integration of viral- or plasmid-derived DNA segments into CRISPR arrays (forming spacers between repeated structural sequences), and expression of short crRNAs from these single repeat-spacer units, to generate interference to future invading foreign genomes. Powered by an effective computational approach (the targeted assembly approach for CRISPR), our analysis of CRISPR arrays in the HMP datasets provides the very first global view of bacterial immunity systems in human-associated microbial communities. The great diversity of CRISPR spacers we observed among different body sites, in different individuals, and in single individuals over time, indicates the impact of subtle niche differences on the evolution of CRISPR defenses and indicates the key role of bacteriophage (and plasmids) in shaping human microbial communities.
The Cas9-crRNA complex of the Streptococcus thermophilus DGCC7710 CRISPR3-Cas system functions as an RNA-guided endonuclease with crRNA-directed target sequence recognition and protein-mediated DNA cleavage. We show here that an additional RNA molecule, tracrRNA (trans-activating CRISPR RNA), co-purifies with the Cas9 protein isolated from the heterologous E. coli strain carrying the S. thermophilus DGCC7710 CRISPR3-Cas system. We provide experimental evidence that tracrRNA is required for Cas9-mediated DNA interference both in vitro and in vivo. We show that Cas9 specifically promotes duplex formation between the precursor crRNA (pre-crRNA) transcript and tracrRNA, in vitro. Furthermore, the housekeeping RNase III contributes to primary pre-crRNA-tracrRNA duplex cleavage for mature crRNA biogenesis. RNase III, however, is not required in the processing of a short pre-crRNA transcribed from a minimal CRISPR array containing a single spacer. Finally, we show that an in vitro-assembled ternary Cas9-crRNA-tracrRNA complex cleaves DNA. This study further specifies the molecular basis for crRNA-based re-programming of Cas9 to specifically cleave any target DNA sequence for precise genome surgery. The processes for crRNA maturation and effector complex assembly established here will contribute to the further development of the Cas9 re-programmable system for genome editing applications.
CRISPR; DNA silencing; Type II CRISPR-Cas systems
The CRISPR-Cas (Clustered Regularly Interspaced Short Palindromic Repeats – CRISPR associated proteins) system provides adaptive immunity in archaea and bacteria. A hallmark of CRISPR-Cas is the involvement of short crRNAs that guide associated proteins in the destruction of invading DNA or RNA. We present three fundamentally distinct processing pathways in the cyanobacterium Synechocystis sp. PCC6803 for a subtype I-D (CRISPR1), and two type III systems (CRISPR2 and CRISPR3), which are located together on the plasmid pSYSA. Using high-throughput transcriptome analyses and assays of transcript accumulation we found all CRISPR loci to be highly expressed, but the individual crRNAs had profoundly varying abundances despite single transcription start sites for each array. In a computational analysis, CRISPR3 spacers with stable secondary structures displayed a greater ratio of degradation products. These structures might interfere with the loading of the crRNAs into RNP complexes, explaining the varying abundances. The maturation of CRISPR1 and CRISPR2 transcripts depends on at least two different Cas6 proteins. Mutation of gene sll7090, encoding a Cmr2 protein, led to the disappearance of all CRISPR3-derived crRNAs, providing in vivo evidence for a function of Cmr2 in the maturation, regulation of expression, Cmr complex formation or stabilization of CRISPR3 transcripts. Finally, we optimized CRISPR repeat structure prediction and the results indicate that the spacer context can influence individual repeat structures.
CRISPR-Cas systems are RNA-guided immune systems that protect prokaryotes against viruses and other invaders. The CRISPR locus encodes crRNAs that recognize invading nucleic acid sequences and trigger silencing by the associated Cas proteins. There are multiple CRISPR-Cas systems with distinct compositions and mechanistic processes. Thermococcus kodakarensis (Tko) is a hyperthermophilic euryarchaeon that has both a Type I-A Csa and a Type I-B Cst CRISPR-Cas system. We have analyzed the expression and composition of crRNAs from the three CRISPRs in Tko by RNA deep sequencing and northern analysis. Our results indicate that crRNAs associated with these two CRISPR-Cas systems include an 8-nucleotide conserved sequence tag at the 5′ end. We challenged Tko with plasmid invaders containing sequences targeted by endogenous crRNAs and observed active CRISPR-Cas-mediated silencing. Plasmid silencing was dependent on complementarity with a crRNA as well as on a sequence element found immediately adjacent to the crRNA recognition site in the target termed the PAM (protospacer adjacent motif). Silencing occurred independently of the orientation of the target sequence in the plasmid, and appears to occur at the DNA level, presumably via DNA degradation. In addition, we have directed silencing of an invader plasmid by genetically engineering the chromosomal CRISPR locus to express customized crRNAs directed against the plasmid. Our results support CRISPR engineering as a feasible approach to develop prokaryotic strains that are resistant to infection for use in industry.
CRISPR; Cas; archaea; Thermococcus; hyperthermophile; immune; RNA; DNA; silencing; interference
Discriminating self and non-self is a universal requirement of immune systems. Adaptive immune systems in prokaryotes are centered around repetitive loci called CRISPRs (clustered regularly interspaced short palindromic repeats), into which invader DNA fragments are incorporated. CRISPR transcripts are processed into small RNAs that guide CRISPR-associated (Cas) proteins to invading nucleic acids by complementary base pairing. However, to avoid autoimmunity it is essential that these RNA-guides exclusively target invading DNA and not complementary DNA sequences (i.e., self-sequences) located in the host's own CRISPR locus. Previous work on the Type III-A CRISPR system from Staphylococcus epidermidis has demonstrated that a portion of the CRISPR RNA-guide sequence is involved in self versus non-self discrimination. This self-avoidance mechanism relies on sensing base pairing between the RNA-guide and sequences flanking the target DNA. To determine if the RNA-guide participates in self versus non-self discrimination in the Type I-E system from Escherichia coli, we altered base pairing potential between the RNA-guide and the flanks of DNA targets. Here we demonstrate that Type I-E systems discriminate self from non-self through a base pairing-independent mechanism that strictly relies on the recognition of four unchangeable PAM sequences. In addition, this work reveals that the first base pair between the guide RNA and the PAM nucleotide immediately flanking the target sequence can be disrupted without affecting the interference phenotype. Remarkably, this indicates that base pairing at this position is not involved in foreign DNA recognition. Results in this paper reveal that the Type I-E mechanism of avoiding self sequences and preventing autoimmunity is fundamentally different from that employed by Type III-A systems. We propose the exclusive targeting of PAM-flanked sequences to be termed a target versus non-target discrimination mechanism.
CRISPR loci and their associated genes form a diverse set of adaptive immune systems that are widespread among prokaryotes. In these systems, the CRISPR-associated genes (cas) encode proteins that capture fragments of invading DNA and integrate these sequences between repeat sequences of the host's CRISPR locus. This information is used upon re-infection to degrade invader genomes. Storing invader sequences in host genomes necessitates a mechanism to differentiate between invader sequences on invader genomes and invader sequences on the host genome. CRISPR-Cas of Staphylococcus epidermidis (Type III-A system) is inhibited when invader sequences are flanked by repeat sequences, and this prevents targeting of the CRISPR locus on the host genome. Here we demonstrate that Escherichia coli CRISPR-Cas (Type I-E system) is not inhibited by repeat sequences. Instead, this system is specifically activated by the presence of bona fide Protospacer Adjacent Motifs (PAMs) in the target. PAMs are conserved sequences adjoining invader sequences on the invader genome, and these sequences are never adjacent to invader sequences within host CRISPR loci. PAM recognition is not affected by base pairing potential of the target with the crRNA. As such, the Type I-E system lacks the ability to specifically recognize self DNA.
The prokaryotic Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) system utilizes genomically-encoded CRISPR RNA (crRNA), derived from invading viruses and incorporated into ribonucleoprotein complexes with CRISPR-associated (CAS) proteins, to target and degrade viral DNA or RNA on subsequent infection. RNA is targeted by the CMR complex. In Sulfolobus solfataricus, this complex is composed of seven CAS protein subunits (Cmr1-7) and carries a diverse “payload” of targeting crRNA. The crystal structure of Cmr7 and a low-resolution structure of the complex are presented. S. solfataricus CMR cleaves RNA targets in an endonucleolytic reaction at UA dinucleotides. This activity is dependent on the 8-nucleotide repeat-derived 5′ sequence in the crRNA, but not on the presence of a protospacer adjacent motif (PAM) in the target. Both target and guide RNAs can be cleaved, although a single molecule of guide RNA can support the degradation of multiple targets.
CRISPR/Cas systems constitute a widespread class of immunity systems that protect bacteria and archaea against phages and plasmids, and commonly use repeat/spacer-derived short crRNAs to silence foreign nucleic acids in a sequence-specific manner. Although the maturation of crRNAs represents a key event in CRISPR activation, the responsible endoribonucleases (CasE, Cas6, Csy4) are missing in many CRISPR/Cas subtypes. Here, differential RNA sequencing of the human pathogen Streptococcus pyogenes uncovered tracrRNA, a trans-encoded small RNA with 24-nucleotide complementarity to the repeat regions of crRNA precursor transcripts. We show that tracrRNA directs the maturation of crRNAs by the activities of the widely conserved endogenous RNase III and the CRISPR-associated Csn1 protein; all these components are essential to protect S. pyogenes against prophage-derived DNA. Our study reveals a novel pathway of small guide RNA maturation and the first example of a host factor (RNase III) required for bacterial RNA-mediated immunity against invaders.
Prokaryotic immunity against foreign nucleic acids mediated by clustered, regularly interspaced, short palindromic repeats (CRISPR) depends on the expression of the CRISPR-associated (Cas) proteins and the formation of small CRISPR RNAs (crRNAs). The crRNA-loaded Cas ribonucleoprotein complexes convey the specific recognition and inactivation of target nucleic acids. In E. coli K12, the maturation of crRNAs and the interference with target DNA are performed by the Cascade complex. The transcription of the Cascade operon is tightly repressed through H-NS-dependent inhibition of the Pcas promoter. Elevated levels of the LysR-type regulator LeuO induce the Pcas promoter and concomitantly activate the CRISPR-mediated immunity against phages. Here, we show that the Pcas promoter can also be induced by constitutive expression of the regulator BglJ. This activation is LeuO-dependent, as heterodimers of BglJ and RcsB activate leuO transcription. Each transcription factor, LeuO or BglJ, induced the transcription of the Cascade genes to comparable levels. However, the maturation of the crRNAs was activated in LeuO-expressing but not in BglJ-expressing cells. Studies on CRISPR promoter activities, transcript stabilities, crRNA processing and Cascade protein levels were performed to determine why crRNA maturation is defective in BglJ-expressing cells. Our results demonstrate that the activation of Cascade gene transcription is necessary but not sufficient to turn on the CRISPR-mediated immunity and suggest a more complex regulation of the type I-E CRISPR-Cas system in E. coli.
CRISPR; Cas protein; Cascade; H-NS; LeuO; transcription regulation
The clustered regularly interspaced short palindromic repeats (CRISPR) system represents a highly adaptive and heritable defense system against foreign nucleic acids in bacteria and archaea. We analyzed the two CRISPR-Cas systems in Methanosarcina mazei strain Gö1. Although belonging to different subtypes (I-B and III-B), the leaders and repeats of both loci are nearly identical. Also, despite many point mutations in each array, a common hairpin motif was identified in the repeats by a bioinformatics analysis and in vitro structural probing. The expression and maturation of CRISPR-derived RNAs (crRNAs) were studied in vitro and in vivo. Both respective potential Cas6b-type endonucleases were purified and their activity tested in vitro. Each protein showed significant activity and could cleave both repeats at the same processing site. Cas6b of subtype III-B, however, was significantly more efficient in its cleavage activity compared with Cas6b of subtype I-B. Northern blot and differential RNAseq analyses were performed to investigate in vivo transcription and maturation of crRNAs, revealing generally very low expression of both systems, whereas significant induction at high NaCl concentrations was observed. crRNAs derived proximal to the leader were generally more abundant than distal ones and in vivo processing sites were clarified for both loci, confirming the previously well-established 8 nt 5′ repeat tags. The 3′-ends were more diverse, but generally ended in a prefix of the following repeat sequence (3′-tag). The analysis further revealed a 5′-hydroxy and 3′-phosphate termini architecture of small crRNAs specific for cleavage products of Cas6 endonucleases from type I-E and I-F and type III-B.
methanoarchaea; CRISPR-Cas system; immunity of prokaryotes; regulatory RNA; phages; Methanosarcina mazei
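The crRNA architecture described in the abstract above (an 8 nt 5′ tag taken from the upstream repeat, followed by the spacer and a 3′ tag that is a prefix of the following repeat) can be illustrated with a toy example. This is a minimal sketch and not the authors' analysis: the function name `process_pre_crrna` and the repeat and spacer sequences are hypothetical, and the code assumes exactly one cut per repeat with no further 3′ trimming, whereas the abstract reports that the 3′ ends are more diverse.

```python
def process_pre_crrna(repeat, spacers, tag_len=8):
    """Cut a toy pre-crRNA once inside every repeat.

    Assumption (for illustration only): the cut lies tag_len nucleotides
    upstream of each repeat's 3' end, so each product carries a tag_len-nt
    5' tag and a 3' tag made of the first part of the next repeat.
    """
    if not 0 < tag_len < len(repeat):
        raise ValueError("tag_len must be shorter than the repeat")

    # Build the transcript: repeat-spacer-repeat-...-spacer-repeat.
    pre_crrna = repeat + "".join(spacer + repeat for spacer in spacers)

    # Record the cut position inside each repeat copy.
    cut_offset = len(repeat) - tag_len
    cuts = []
    repeat_start = 0
    for spacer in spacers:
        cuts.append(repeat_start + cut_offset)
        repeat_start += len(repeat) + len(spacer)
    cuts.append(repeat_start + cut_offset)  # cut in the final repeat

    # Each fragment between two consecutive cuts is one unit-length crRNA.
    return [pre_crrna[a:b] for a, b in zip(cuts, cuts[1:])]


# Hypothetical 16-nt repeat and two 10-nt spacers (not real M. mazei sequences).
toy_repeat = "CCCCGGGGAUUGAAAC"
toy_spacers = ["AAAAAAAAAA", "UUUUUUUUUU"]
crrnas = process_pre_crrna(toy_repeat, toy_spacers)
```

Under these assumptions, each element of `crrnas` begins with the last 8 nt of the repeat and ends with the first 8 nt of the next repeat copy.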
CRISPR loci are essential components of the adaptive immune system of archaea and bacteria. They consist of long arrays of repeats separated by DNA spacers encoding guide RNAs (crRNA), which target foreign genetic elements. Cbp1 (CRISPR DNA repeat binding protein) binds specifically to the multiple direct repeats of CRISPR loci of members of the acidothermophilic, crenarchaeal order Sulfolobales. cbp1 gene deletion from Sulfolobus islandicus REY15A produced a strong reduction in pre-crRNA yields from CRISPR loci but did not inhibit the foreign DNA targeting capacity of the CRISPR/Cas system. Conversely, overexpression of Cbp1 in S. islandicus generated an increase in pre-crRNA yields while the level of reverse strand transcripts from CRISPR loci remained unchanged. It is proposed that Cbp1 modulates production of longer pre-crRNA transcripts from CRISPR loci. A possible mechanism is that it minimizes interference from potential transcriptional signals carried on spacers deriving from A-T-rich genetic elements and, occasionally, on DNA repeats. Supporting evidence is provided by microarray and northern blotting analyses, and publicly available whole-transcriptome data for S. solfataricus P2.
Prokaryotes immunize themselves against transmissible genetic elements by the integration (acquisition) in clustered regularly interspaced short palindromic repeats (CRISPR) loci of spacers homologous to invader nucleic acids, defined as protospacers. Following acquisition, mono-spacer CRISPR RNAs (termed crRNAs) guide CRISPR-associated (Cas) proteins to degrade (interference) protospacers flanked by an adjacent motif in extrachromosomal DNA. During acquisition, selection of spacer-precursors adjoining the protospacer motif and proper orientation of the integrated fragment with respect to the leader (sequence leading transcription of the flanking CRISPR array) grant efficient interference by at least some CRISPR-Cas systems. This adaptive stage of the CRISPR action is poorly characterized, mainly due to the lack of appropriate genetic strategies to address its study and, at least in Escherichia coli, the need for Cas overproduction for insertion detection. In this work, we describe the development and application in Escherichia coli strains of an interference-independent assay based on engineered selectable CRISPR-spacer integration reporter plasmids. By using this tool without the constraint of interference or cas overexpression, we confirmed fundamental aspects of this process such as the critical requirement of Cas1 and Cas2 and the identity of the CTT protospacer motif for the E. coli K12 system. In addition, we defined the CWT motif for a non-K12 CRISPR-Cas variant, and obtained data supporting the implication of the leader in spacer orientation, the preferred acquisition from plasmids harboring cas genes and the occurrence of a sequential cleavage at the insertion site by a ruler mechanism.
CRISPR-spacer acquisition; Cascade; Escherichia coli K12; O157:H7; RNA-guided immunity; cas genes; protospacer adjacent motif; reporter plasmids; ruler mechanism; spacer orientation
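The difference between the CTT and CWT protospacer motifs mentioned above comes down to the IUPAC ambiguity code W, which stands for A or T. The following is a minimal sketch of motif scanning under that convention; the function name `find_motif` and the example sequence are hypothetical, and the code only reports where a motif occurs on the given strand. It makes no claim about which side of the protospacer the motif lies on or about how the authors analyzed their data.

```python
# Subset of the IUPAC nucleotide codes: each symbol maps to the bases it allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "AT", "S": "CG", "R": "AG", "Y": "CT",
    "K": "GT", "M": "AC", "N": "ACGT",
}


def find_motif(seq, motif):
    """Return the 0-based start positions where motif matches seq."""
    seq = seq.upper()
    motif = motif.upper()
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        # A position matches when every base is allowed by the motif symbol.
        if all(seq[i + j] in IUPAC[symbol] for j, symbol in enumerate(motif)):
            hits.append(i)
    return hits


# Hypothetical sequence used only to show that CWT is less strict than CTT.
example = "GGCTTACATGG"
ctt_hits = find_motif(example, "CTT")
cwt_hits = find_motif(example, "CWT")
```

Every CTT site is also a CWT site, so `cwt_hits` always contains `ctt_hits`; in this example CWT additionally matches the CAT triplet.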
Bacteria and Archaea encode clustered, regularly interspaced, short palindromic repeat (CRISPR) systems to confer adaptive immunity to invasive viruses and plasmids. Recent studies of CRISPR systems revealed that diverse CRISPR-associated (Cas) interference modules often coexist in different organisms, but the functions of cas genes have not been dissected in any of these systems. The crenarchaeon Sulfolobus islandicus encodes three distinct CRISPR interference modules, including a type IA system and two type IIIB systems: Cmr-α and Cmr-β. To study the genetic determinants of protospacer-adjacent motif (PAM)-dependent DNA targeting activity and mature CRISPR RNA (crRNA) production in this organism, mutants lacking individual genes of the type IA system or each of the other Cas modules were constructed. Characterization of these mutants revealed that Cas7, Cas5, Cas6, Cas3′ and Cas3″ are essential for PAM-dependent DNA targeting activity, whereas Csa5, along with all other Cas modules, is dispensable for the targeting in the crenarchaeon. Cas6 is implicated as the only enzyme for pre-crRNA processing, and crRNA maturation is independent of the DNA targeting activity. Importantly, we show that Cas7 and Cas5 are essential for stabilizing the processing intermediates and mature crRNAs, respectively, and that depleting the helicase or nuclease domain of Cas3 leads to the accumulation of processing intermediates. This demonstrates that in addition to Cas6, other Cas proteins of an archaeal type IA system also contribute to crRNA processing.
CRISPR/Cas; Cascade; protospacer-adjacent motifs; crRNA biogenesis; DNA interference; cas mutants; Sulfolobus islandicus
CRISPR-Cas is an adaptive prokaryotic immune system, providing protection against viruses and other mobile genetic elements. In type I and type III CRISPR-Cas systems, CRISPR RNA (crRNA) is generated by cleavage of a primary transcript by the Cas6 endonuclease and loaded into multisubunit surveillance/effector complexes, allowing homology-directed detection and cleavage of invading elements. Highly studied CRISPR-Cas systems such as those in Escherichia coli and Pseudomonas aeruginosa have a single Cas6 enzyme that is an integral subunit of the surveillance complex. By contrast, Sulfolobus solfataricus has a complex CRISPR-Cas system with three types of surveillance complexes (Cascade/type I-A, CSM/type III-A and CMR/type III-B), five Cas6 paralogues and two different CRISPR-repeat families (AB and CD). Here, we investigate the kinetic properties of two different Cas6 paralogues from S. solfataricus. The Cas6-1 subtype is specific for CD-family CRISPR repeats, generating crRNA by multiple turnover catalysis whilst Cas6-3 has a broader specificity and also processes a non-coding RNA with a CRISPR repeat-related sequence. Deep sequencing of crRNA in surveillance complexes reveals a biased distribution of spacers derived from AB and CD loci, suggesting functional coupling between Cas6 paralogues and their downstream effector complexes.
Motivation: The discovery of CRISPR-Cas systems almost 20 years ago rapidly changed our perception of the bacterial and archaeal immune systems. CRISPR loci consist of several repetitive DNA sequences called repeats, inter-spaced by stretches of variable length sequences called spacers. This CRISPR array is transcribed and processed into multiple mature RNA species (crRNAs). A single crRNA is integrated into an interference complex, together with CRISPR-associated (Cas) proteins, to bind and degrade invading nucleic acids. Although existing bioinformatics tools can recognize CRISPR loci by their characteristic repeat-spacer architecture, they generally output CRISPR arrays of ambiguous orientation and thus do not determine the strand from which crRNAs are processed. Knowledge of the correct orientation is crucial for many tasks, including the classification of CRISPR conservation, the detection of leader regions, the identification of target sites (protospacers) on invading genetic elements and the characterization of protospacer-adjacent motifs.
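The orientation ambiguity described above arises because a repeat-spacer array detected on one DNA strand is equally visible, as its reverse complement, on the other strand. As a minimal sketch (the function name `reverse_complement` and the example repeat are hypothetical, and this is not the prediction method itself), the two candidate readings of a repeat can be generated as follows:

```python
# Translation table for DNA complementation (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.translate(_COMPLEMENT)[::-1]


# Hypothetical repeat fragment: a locus finder may report either reading,
# and only one of them corresponds to the strand that encodes the crRNA.
reported_repeat = "GTTTCAATCC"
other_reading = reverse_complement(reported_repeat)
```

Deciding which of the two readings is the transcribed one is the classification problem that an orientation predictor has to solve.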
Results: We present a fast and accurate tool to determine the crRNA-encoding strand at CRISPR loci by predicting the correct orientation of repeats based on an advanced machine learning approach. Both the repeat sequence and mutation information were encoded and processed by an efficient graph kernel to learn higher-order correlations. The model was trained and tested on curated data comprising >4500 CRISPRs and yielded a remarkable performance of 0.95 AUC ROC (area under the curve of the receiver operator characteristic). In addition, we show that accurate orientation information greatly improved detection of conserved repeat sequence families and structure motifs. We integrated CRISPRstrand predictions into our CRISPRmap web server of CRISPR conservation and updated the latter to version 2.0.
Availability: CRISPRmap and CRISPRstrand are available at http://rna.informatik.uni-freiburg.de/CRISPRmap.
Supplementary data are available at Bioinformatics online.
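The AUC ROC figure quoted in the Results can be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below computes it directly from that definition; the function name `auc_roc` and the toy labels and scores are hypothetical and unrelated to the CRISPRstrand data, and the quadratic pairwise loop is chosen for clarity rather than speed.

```python
def auc_roc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 class labels; scores: matching classifier scores.
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: three of the four positive/negative pairs are ranked correctly.
toy_auc = auc_roc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1])
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes.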
Clustered regularly interspaced short palindromic repeats (CRISPRs), together with an operon of CRISPR-associated (Cas) proteins, form an RNA-based prokaryotic immune system against exogenous genetic elements. Cas5 family proteins are found in several Type I CRISPR-Cas systems. Here we report the molecular function of Subtype I-C/Dvulg Cas5d from B. halodurans. We show that Cas5d cleaves pre-crRNA into unit length by recognizing both the hairpin structure and the 3′ single stranded sequence in the CRISPR repeat region. Cas5d structure reveals a ferredoxin domain-based architecture and a catalytic triad formed by Y46, K116 and H117 residues. We further show that after pre-crRNA processing, Cas5d assembles with crRNA, Csd1, and Csd2 proteins to form a multi-subunit interference complex similar to E. coli Cascade (CRISPR-associated complex for antiviral defense) in architecture. Our results suggest that formation of a crRNA-presenting Cascade-like complex is likely a common theme among Type I CRISPR subtypes.
CRISPR interference confers adaptive, sequence-based immunity against viruses and plasmids and is specified by CRISPR RNAs (crRNAs) that are transcribed and processed from spacer-repeat units. Pre-crRNA processing is essential for CRISPR interference in all systems studied thus far. Here, our studies of crRNA biogenesis and CRISPR interference in naturally competent Neisseria spp. reveal a unique crRNA maturation pathway in which crRNAs are transcribed from promoters that are embedded within each repeat, yielding crRNA 5′ ends formed by transcription and not by processing. Although crRNA 3′ end formation involves RNase III and trans-encoded tracrRNA, as in other Type II CRISPR systems, this processing is dispensable for interference. The meningococcal pathway is the most streamlined CRISPR/cas system characterized to date. Endogenous CRISPR spacers limit natural transformation, which is the primary source of genetic variation that contributes to immune evasion, antibiotic resistance, and virulence in the human pathogen N. meningitidis.
Neisseria; CRISPR/Cas; dRNA-seq; RNA surveillance system; pathogen; RNA processing; crRNA; Cas9; Type II-C