The interaction of viruses and their prokaryotic hosts shaped the evolution of bacterial and archaeal life. Prokaryotes developed several strategies to evade viral attacks, including restriction-modification, abortive infection and CRISPR/Cas systems. CRISPR/Cas systems are adaptive immune systems found in many Bacteria and most Archaea; they consist of clustered regularly interspaced short palindromic repeat (CRISPR) sequences and a number of CRISPR-associated (Cas) genes (Fig. 1) 1-3. Different sets of Cas proteins and repeats define at least three major divergent types of CRISPR/Cas systems 4. The universal proteins Cas1 and Cas2 are proposed to be involved in the uptake of viral DNA, which generates a new spacer element between two repeats at the 5' terminus of an extending CRISPR cluster 5. The entire cluster is transcribed into a precursor-crRNA containing all spacer and repeat sequences and is subsequently processed by an enzyme of the diverse Cas6 family into smaller crRNAs 6-8. These crRNAs consist of the spacer sequence flanked by a 5' terminal tag (8 nucleotides) and a 3' terminal tag, both derived from the repeat sequence 9. A repeated infection by the virus can now be blocked, as the new crRNA is directed by a Cas protein complex (Cascade) to the viral DNA and identifies it as such via base complementarity 10. Finally, in CRISPR/Cas type I systems, the nuclease Cas3 destroys the detected invader DNA 11,12.
These processes define CRISPR/Cas as an adaptive immune system of prokaryotes and opened a fascinating research field for the study of the involved Cas proteins. The function of many Cas proteins is still elusive and the causes for the apparent diversity of the CRISPR/Cas systems remain to be illuminated. Potential activities of most Cas proteins were predicted via detailed computational analyses. A major fraction of Cas proteins are either shown or proposed to function as endonucleases 4.
Here, we present methods to generate crRNAs and precursor-crRNAs (pre-crRNAs) for the study of Cas endoribonucleases. Different endonuclease assays require either short repeat sequences that can be synthesized directly as RNA oligonucleotides or longer crRNA and pre-crRNA sequences that are generated via in vitro T7 RNA polymerase run-off transcription. This methodology allows the incorporation of radioactive nucleotides for the generation of internally labeled endonuclease substrates and the creation of synthetic or mutant crRNAs. Cas6 endonuclease activity is utilized to mature pre-crRNAs into crRNAs with 5'-hydroxyl and 2',3'-cyclic phosphate termini.
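The maturation step described above can be illustrated in silico. The following is a minimal sketch, assuming that the Cas6-type cut falls inside every repeat exactly 8 nucleotides upstream of the repeat's 3' end (which yields the 8-nucleotide 5' tag mentioned earlier); actual cleavage positions vary between Cas6 enzymes. The function name, the toy repeat and the toy spacers are hypothetical, and the strings do not model the 5'-hydroxyl and 2',3'-cyclic phosphate chemistry of the termini.

```python
def process_pre_crrna(pre_crrna: str, repeat: str, tag_len: int = 8) -> list[str]:
    """Cut a repeat-spacer transcript inside each repeat, tag_len nt before the repeat's 3' end.

    Returns the fragments lying between consecutive cut sites, i.e. the
    crRNAs that carry a 5' tag, a spacer and a 3' tag. The leading and
    trailing fragments (cut on one side only) are discarded.
    """
    if not 0 < tag_len < len(repeat):
        raise ValueError("tag_len must be between 1 and len(repeat) - 1")

    cut_sites = []
    start = pre_crrna.find(repeat)
    while start != -1:
        # Assumed cut position: tag_len nucleotides before the end of this repeat.
        cut_sites.append(start + len(repeat) - tag_len)
        # Continue searching after the current repeat (non-overlapping matches).
        start = pre_crrna.find(repeat, start + len(repeat))

    return [pre_crrna[a:b] for a, b in zip(cut_sites, cut_sites[1:])]


# Hypothetical toy sequences, not taken from any organism.
REPEAT = "CCCCGGGGAAAAUUUUACGUACGU"
SPACER_1 = "AAGGAAGGAAGGAAGG"
SPACER_2 = "UUCCUUCCUUCCUUCC"
PRE_CRRNA = REPEAT + SPACER_1 + REPEAT + SPACER_2 + REPEAT

if __name__ == "__main__":
    for crrna in process_pre_crrna(PRE_CRRNA, REPEAT):
        print(crrna)
```

Each returned crRNA starts with the last 8 nucleotides of the upstream repeat (the 5' tag), continues with the spacer, and ends with the remainder of the downstream repeat (the 3' tag). Mutant or synthetic substrates can be explored by changing the toy repeat or `tag_len`.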
Molecular biology; Issue 67; CRISPR/Cas; endonuclease; in vitro transcription; crRNA; Cas6
Using the hyperthermophile Pyrococcus furiosus, we have delineated several key steps in CRISPR (clustered regularly interspaced short palindromic repeats)–Cas (CRISPR-associated) invader defence pathways. P. furiosus has seven transcriptionally active CRISPR loci that together encode a total of 200 crRNAs (CRISPR RNAs). The 27 Cas proteins in this organism represent three distinct pathways and are primarily encoded in two large gene clusters. The Cas6 protein dices CRISPR locus transcripts to generate individual invader-targeting crRNAs. The mature crRNAs include a signature sequence element (the 5′ tag) derived from the CRISPR locus repeat sequence that is important for function. crRNAs are tailored into distinct species and integrated into three distinct crRNA–Cas protein complexes that are all candidate effector complexes. The complex formed by the Cmr [Cas module RAMP (repeat-associated mysterious proteins)] (subtype III-B) proteins cleaves complementary target RNAs and can be programmed to cleave novel target RNAs in a prokaryotic RNAi-like manner. Evidence suggests that the other two CRISPR–Cas systems in P. furiosus, Csa (Cas subtype Apern) (subtype I-A) and Cst (Cas subtype Tneap) (subtype I-B), target invaders at the DNA level. Studies of the CRISPR–Cas systems from P. furiosus are yielding fundamental knowledge of mechanisms of crRNA biogenesis and silencing for three of the diverse CRISPR–Cas pathways, and reveal that organisms such as P. furiosus possess an arsenal of multiple RNA-guided mechanisms to resist diverse invaders. Our knowledge of the fascinating CRISPR–Cas pathways is leading in turn to our ability to co-opt these systems for exciting new biomedical and biotechnological applications.
clustered regularly interspaced short palindromic repeats (CRISPR); CRISPR-associated (Cas); non-coding RNA; prokaryotic immunity; Pyrococcus furiosus; virus
Bacteria and Archaea encode clustered, regularly interspaced, short palindromic repeat (CRISPR) systems to confer adaptive immunity to invasive viruses and plasmids. Recent studies of CRISPR systems revealed that diverse CRISPR-associated (Cas) interference modules often coexist in different organisms, but the functions of cas genes have not been dissected in any of these systems. The crenarchaeon Sulfolobus islandicus encodes three distinct CRISPR interference modules, including a type IA system and two type IIIB systems: Cmr-α and Cmr-β. To study the genetic determinants of protospacer-adjacent motif (PAM)-dependent DNA targeting activity and mature CRISPR RNA (crRNA) production in this organism, mutants lacking individual genes of the type IA system or lacking each of the other Cas modules were constructed. Characterization of these mutants revealed that Cas7, Cas5, Cas6, Cas3′ and Cas3′′ are essential for PAM-dependent DNA targeting activity, whereas Csa5, along with all other Cas modules, is dispensable for targeting in the crenarchaeon. Cas6 is implicated as the only enzyme for pre-crRNA processing, and crRNA maturation is independent of the DNA targeting activity. Importantly, we show that Cas7 and Cas5 are essential for stabilizing the processing intermediates and mature crRNAs, respectively, and that depleting the helicase or nuclease domain of Cas3 leads to the accumulation of processing intermediates. This demonstrates that in addition to Cas6, other Cas proteins of an archaeal type IA system also contribute to crRNA processing.
CRISPR/Cas; Cascade; protospacer-adjacent motifs; crRNA biogenesis; DNA interference; cas mutants; Sulfolobus islandicus
The CRISPR-Cas (Clustered Regularly Interspaced Short Palindromic Repeats – CRISPR associated proteins) system provides adaptive immunity in archaea and bacteria. A hallmark of CRISPR-Cas is the involvement of short crRNAs that guide associated proteins in the destruction of invading DNA or RNA. We present three fundamentally distinct processing pathways in the cyanobacterium Synechocystis sp. PCC6803 for a subtype I-D (CRISPR1), and two type III systems (CRISPR2 and CRISPR3), which are located together on the plasmid pSYSA. Using high-throughput transcriptome analyses and assays of transcript accumulation we found all CRISPR loci to be highly expressed, but the individual crRNAs had profoundly varying abundances despite single transcription start sites for each array. In a computational analysis, CRISPR3 spacers with stable secondary structures displayed a greater ratio of degradation products. These structures might interfere with the loading of the crRNAs into RNP complexes, explaining the varying abundances. The maturation of CRISPR1 and CRISPR2 transcripts depends on at least two different Cas6 proteins. Mutation of gene sll7090, encoding a Cmr2 protein, led to the disappearance of all CRISPR3-derived crRNAs, providing in vivo evidence for a function of Cmr2 in the maturation, regulation of expression, Cmr complex formation or stabilization of CRISPR3 transcripts. Finally, we optimized CRISPR repeat structure prediction and the results indicate that the spacer context can influence individual repeat structures.
CRISPR/Cas (Clustered Regularly Interspaced Short Palindromic Repeats/CRISPR associated sequences) is a recently discovered prokaryotic defense system against foreign DNA, including viruses and plasmids. The CRISPR cassette is transcribed as a continuous transcript (pre-crRNA), which is processed by Cas proteins into small RNA molecules (crRNAs) that are responsible for defense against invading viruses. Experiments in E. coli report that overexpression of cas genes generates a large number of crRNAs from only a few pre-crRNAs.
We here develop a minimal model of CRISPR processing, which we parameterize based on available experimental data. From the model, we show that the system can generate a large amount of crRNAs, based on only a small decrease in the amount of pre-crRNAs. The relationship between the decrease of pre-crRNAs and the increase of crRNAs corresponds to strong linear amplification. Interestingly, this strong amplification crucially depends on fast non-specific degradation of pre-crRNA by an unidentified nuclease. We show that overexpression of cas genes above a certain level does not result in further increase of crRNA, but that this saturation can be relieved if the rate of CRISPR transcription is increased. We furthermore show that a small increase of CRISPR transcription rate can substantially decrease the extent of cas gene activation necessary to achieve a desired amount of crRNA.
The simple mathematical model developed here is able to explain existing experimental observations on CRISPR transcript processing in Escherichia coli. The model shows that a competition between specific pre-crRNA processing and non-specific degradation determines the steady-state levels of crRNA and is responsible for strong linear amplification of crRNAs when cas genes are overexpressed. The model further shows how disappearance of only a few pre-crRNA molecules normally present in the cell can lead to a large (two orders of magnitude) increase of crRNAs upon cas overexpression. A crucial ingredient of this large increase is fast non-specific degradation by an unspecified nuclease, which suggests that a yet unidentified nuclease(s) is a major control element of CRISPR response. Transcriptional regulation may be another important control mechanism, as it can either increase the amount of generated pre-crRNA, or alter the level of cas gene activity.
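The competition between specific processing and non-specific degradation can be made concrete with a two-variable steady-state calculation. This is a minimal sketch under stated assumptions, not the published model: pre-crRNA is assumed to be transcribed at a constant rate `phi`, degraded non-specifically at rate `lam_pre` and processed at rate `k` (standing in for the level of cas gene activity), while crRNA is degraded at rate `lam_cr`. All parameter names and values are hypothetical, and the sketch counts one crRNA per processed transcript.

```python
def steady_state(phi: float, k: float, lam_pre: float, lam_cr: float) -> tuple[float, float]:
    """Steady state of the assumed two-variable system.

        d[pre]/dt = phi - (lam_pre + k) * pre
        d[cr]/dt  = k * pre - lam_cr * cr
    """
    pre = phi / (lam_pre + k)
    cr = k * pre / lam_cr
    return pre, cr


# Hypothetical parameters: pre-crRNA decays 100x faster than crRNA.
PHI = 10.0      # transcription rate of the CRISPR array
LAM_PRE = 1.0   # non-specific pre-crRNA degradation rate
LAM_CR = 0.01   # crRNA degradation rate

if __name__ == "__main__":
    for k in (0.01, 0.1, 1.0, 10.0, 100.0):
        pre, cr = steady_state(PHI, k, LAM_PRE, LAM_CR)
        print(f"k={k:7.2f}  pre-crRNA={pre:8.3f}  crRNA={cr:9.3f}")
```

Eliminating `k` from the two steady-state expressions gives `cr = (phi - lam_pre * pre) / lam_cr`, so in this sketch the gain in crRNA is linear in the loss of pre-crRNA with slope `lam_pre / lam_cr`: fast non-specific degradation of pre-crRNA is what makes the amplification strong. The crRNA level is bounded by `phi / lam_cr` however large `k` becomes, which corresponds to the saturation that only a higher transcription rate can relieve.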
This article was reviewed by Mikhail Gelfand, Eugene Koonin and L Aravind.
CRISPR/Cas; Transcript processing; Small RNA; CRISPR expression regulation; CRISPR/Cas response
CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) loci, together with cas (CRISPR–associated) genes, form the CRISPR/Cas adaptive immune system, a primary defense strategy that eubacteria and archaea mobilize against foreign nucleic acids, including phages and conjugative plasmids. Short spacer sequences separated by the repeats are derived from foreign DNA and direct interference to future infections. The availability of hundreds of shotgun metagenomic datasets from the Human Microbiome Project (HMP) enables us to explore the distribution and diversity of known CRISPRs in human-associated microbial communities and to discover new CRISPRs. We propose a targeted assembly strategy to reconstruct CRISPR arrays, which whole-metagenome assemblies fail to identify. For each known CRISPR type (identified from reference genomes), we use its direct repeat consensus sequence to recruit reads from each HMP dataset and then assemble the recruited reads into CRISPR loci; the unique spacer sequences can then be extracted for analysis. We also identified novel CRISPRs or new CRISPR variants in contigs from whole-metagenome assemblies and used targeted assembly to more comprehensively identify these CRISPRs across samples. We observed that the distributions of CRISPRs (including 64 known and 86 novel ones) are largely body-site specific. We provide detailed analysis of several CRISPR loci, including novel CRISPRs. For example, known streptococcal CRISPRs were identified in most oral microbiomes, totaling ∼8,000 unique spacers: samples resampled from the same individual and oral site shared the most spacers; different oral sites from the same individual shared significantly fewer, while different individuals had almost no common spacers, indicating the impact of subtle niche differences on the evolution of CRISPR defenses. We further demonstrate potential applications of CRISPRs to the tracing of rare species and the virus exposure of individuals. 
This work indicates the importance of effective identification and characterization of CRISPR loci to the study of the dynamic ecology of microbiomes.
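The recruit-then-extract idea behind the targeted strategy can be sketched in a few lines. This is a simplified illustration under loud assumptions: reads are recruited by exact matches to the repeat consensus (a real pipeline would tolerate mismatches and consider both strands), and the assembly step is skipped entirely, so spacers are only recovered when a single read or contig contains two flanking repeats. All function names and sequences are hypothetical.

```python
def recruit_reads(reads: list[str], repeat: str) -> list[str]:
    """Keep only reads that contain the repeat consensus at least once (exact match)."""
    return [read for read in reads if repeat in read]


def extract_spacers(sequence: str, repeat: str) -> list[str]:
    """Return the sequences lying between consecutive, non-overlapping repeat copies."""
    positions = []
    start = sequence.find(repeat)
    while start != -1:
        positions.append(start)
        start = sequence.find(repeat, start + len(repeat))
    return [sequence[a + len(repeat):b] for a, b in zip(positions, positions[1:])]


def unique_spacers(reads: list[str], repeat: str) -> set[str]:
    """Collect the set of distinct spacers found in all recruited reads."""
    spacers: set[str] = set()
    for read in recruit_reads(reads, repeat):
        spacers.update(extract_spacers(read, repeat))
    return spacers


# Hypothetical toy data, not taken from any HMP dataset.
REPEAT = "CCGGAATTCCGG"
READS = [
    REPEAT + "AAAT" + REPEAT + "GGGT" + REPEAT,
    "TTTTTTTTTTTT",
    "ACGT" + REPEAT + "AAAT" + REPEAT,
]

if __name__ == "__main__":
    print(sorted(unique_spacers(READS, REPEAT)))
```

Comparing the spacer sets returned for different samples (for example with set intersection) is one simple way to quantify how many spacers are shared between body sites or individuals.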
Human bodies are complex ecological systems in which various microbial organisms and viruses interact with each other and with the human host. The Human Microbiome Project (HMP) has resulted in >700 datasets of shotgun metagenomic sequences, from which we can learn about the compositions and functions of human-associated microbial communities. CRISPR/Cas systems are a widespread class of adaptive immune systems in bacteria and archaea, providing acquired immunity against foreign nucleic acids: CRISPR/Cas defense pathways involve integration of viral- or plasmid-derived DNA segments into CRISPR arrays (forming spacers between repeated structural sequences), and expression of short crRNAs from these single repeat-spacer units, to generate interference to future invading foreign genomes. Powered by an effective computational approach (the targeted assembly approach for CRISPR), our analysis of CRISPR arrays in the HMP datasets provides the very first global view of bacterial immunity systems in human-associated microbial communities. The great diversity of CRISPR spacers we observed among different body sites, in different individuals, and in single individuals over time, indicates the impact of subtle niche differences on the evolution of CRISPR defenses and indicates the key role of bacteriophage (and plasmids) in shaping human microbial communities.
The CRISPR arrays found in many bacteria and most archaea are transcribed into a long precursor RNA that is processed into small clustered regularly interspaced short palindromic repeats (CRISPR) RNAs (crRNAs). These RNA molecules can contain fragments of viral genomes and mediate, together with a set of CRISPR-associated (Cas) proteins, the prokaryotic immunity against viral attacks. CRISPR/Cas systems are diverse and the Cas6 enzymes that process crRNAs vary between different subtypes. We analysed CRISPR/Cas subtype I-B and present the identification of novel Cas6 enzymes from the bacterial and archaeal model organisms Clostridium thermocellum and Methanococcus maripaludis C5. The in vitro activity and specificity of Methanococcus maripaludis Cas6b were determined. Two complementary catalytic histidine residues were identified. RNA-Seq analyses revealed in vivo crRNA processing sites, crRNA abundance and orientation of CRISPR transcription within these two organisms. Individual spacer sequences were identified with strong effects on transcription and processing patterns of a CRISPR cluster. These effects will need to be considered for the application of CRISPR clusters that are designed to produce synthetic crRNAs.
Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)-CRISPR-associated (Cas) systems of type I use a Cas ribonucleoprotein complex for antiviral defense (Cascade) to mediate the targeting and degradation of foreign DNA. To address molecular features of the archaeal type I-A Cascade interference mechanism, we established the in vitro assembly of the Thermoproteus tenax Cascade from six recombinant Cas proteins, synthetic CRISPR RNAs (crRNAs) and target DNA fragments. RNA-Seq analyses revealed the processing pattern of crRNAs from seven T. tenax CRISPR arrays. Synthetic crRNA transcripts were matured by hammerhead ribozyme cleavage. The assembly of type I-A Cascade indicates that Cas3′ and Cas3′′ are an integral part of the complex, and the interference activity was shown to be dependent on the crRNA and the matching target DNA. The reconstituted Cascade was used to identify sequence motifs that are required for efficient DNA degradation and to investigate the role of the subunits Cas7 and Cas3′′ in the interplay with other Cascade subunits.
All immune systems must distinguish self from non-self to repel invaders without inducing autoimmunity. Clustered, regularly interspaced, short palindromic repeat (CRISPR) loci protect bacteria and archaea from invasion by phage and plasmid DNA through a genetic interference pathway1–9. CRISPR loci are present in ~ 40% and ~90% of sequenced bacterial and archaeal genomes respectively10 and evolve rapidly, acquiring new spacer sequences to adapt to highly dynamic viral populations1, 11–13. Immunity requires a sequence match between the invasive DNA and the spacers that lie between CRISPR repeats1–9. Each cluster is genetically linked to a subset of the cas (CRISPR-associated) genes14–16 that collectively encode >40 families of proteins involved in adaptation and interference. CRISPR loci encode small CRISPR RNAs (crRNAs) that contain a full spacer flanked by partial repeat sequences2, 17–19. CrRNA spacers are thought to identify targets by direct Watson-Crick pairing with invasive “protospacer” DNA2, 3, but how they avoid targeting the spacer DNA within the encoding CRISPR locus itself is unknown. Here we have defined the mechanism of CRISPR self/non-self discrimination. In Staphylococcus epidermidis, target/crRNA mismatches at specific positions outside of the spacer sequence license foreign DNA for interference, whereas extended pairing between crRNA and CRISPR DNA repeats prevents autoimmunity. Hence, this CRISPR system uses the base-pairing potential of crRNAs not only to specify a target but also to spare the bacterial chromosome from interference. Differential complementarity outside of the spacer sequence is a built-in feature of all CRISPR systems, suggesting that this mechanism is a broadly applicable solution to the self/non-self dilemma that confronts all immune pathways.
Type I CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats)–Cas (CRISPR-associated) systems exist in bacterial and archaeal organisms and provide immunity against foreign DNA. The Cas protein content of the DNA interference complexes (termed Cascade) varies between different CRISPR-Cas subtypes. A minimal variant of the Type I-F system was identified in proteobacterial species including Shewanella putrefaciens CN-32. This variant lacks a large subunit (Csy1), Csy2 and Csy3 and contains two unclassified cas genes. The genome of S. putrefaciens CN-32 contains only five Cas proteins (Cas1, Cas3, Cas6f, Cas1821 and Cas1822) and a single CRISPR array with 81 spacers. RNA-Seq analyses revealed the transcription of this array and the maturation of crRNAs (CRISPR RNAs). Interference assays based on plasmid conjugation demonstrated that this CRISPR-Cas system is active in vivo and that activity is dependent on the recognition of the dinucleotide GG PAM (Protospacer Adjacent Motif) sequence and crRNA abundance. The deletion of cas1821 and cas1822 reduced the cellular crRNA pool. Recombinant Cas1821 was shown to form helical filaments bound to RNA molecules, which suggests its role as the Cascade backbone protein. A Cascade complex was isolated which contained multiple Cas1821 copies, Cas1822, Cas6f and mature crRNAs.
Bacteria and archaea face continual onslaughts of rapidly diversifying viruses and plasmids. Many prokaryotes maintain adaptive immune systems known as clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated genes (Cas). CRISPR-Cas systems are genomic sensors that serially acquire viral and plasmid DNA fragments (spacers) that are utilized to target and cleave matching viral and plasmid DNA in subsequent genomic invasions, offering critical immunological memory. Only 50% of sequenced bacteria possess CRISPR-Cas immunity, in contrast to over 90% of sequenced archaea. To probe why half of bacteria lack CRISPR-Cas immunity, we combined comparative genomics and mathematical modeling. Analysis of hundreds of diverse prokaryotic genomes shows that CRISPR-Cas systems are substantially more prevalent in thermophiles than in mesophiles. With sequenced bacteria disproportionately mesophilic and sequenced archaea mostly thermophilic, the presence of CRISPR-Cas appears to depend more on environmental temperature than on bacterial-archaeal taxonomy. Mutation rates are typically severalfold higher in mesophilic prokaryotes than in thermophilic prokaryotes. To quantitatively test whether accelerated viral mutation leads microbes to lose CRISPR-Cas systems, we developed a stochastic model of virus-CRISPR coevolution. The model competes CRISPR-Cas-positive (CRISPR-Cas+) prokaryotes against CRISPR-Cas-negative (CRISPR-Cas−) prokaryotes, continually weighing the antiviral benefits conferred by CRISPR-Cas immunity against its fitness costs. Tracking this cost-benefit analysis across parameter space reveals viral mutation rate thresholds beyond which CRISPR-Cas cannot provide sufficient immunity and is purged from host populations. These results offer a simple, testable viral diversity hypothesis to explain why mesophilic bacteria disproportionately lack CRISPR-Cas immunity. 
More generally, fundamental limits on the adaptability of biological sensors (Lamarckian evolution) are predicted.
A remarkable recent discovery in microbiology is that bacteria and archaea possess systems conferring immunological memory and adaptive immunity. Clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated genes (CRISPR-Cas) are genomic sensors that allow prokaryotes to acquire DNA fragments from invading viruses and plasmids. Providing immunological memory, these stored fragments destroy matching DNA in future viral and plasmid invasions. CRISPR-Cas systems also provide adaptive immunity, keeping up with mutating viruses and plasmids by continually acquiring new DNA fragments. Surprisingly, less than 50% of mesophilic bacteria, in contrast to almost 90% of thermophilic bacteria and Archaea, maintain CRISPR-Cas immunity. Using mathematical modeling, we probe this dichotomy, showing how increased viral mutation rates can explain the reduced prevalence of CRISPR-Cas systems in mesophiles. Rapidly mutating viruses outrun CRISPR-Cas immune systems, likely decreasing their prevalence in bacterial populations. Thus, viral adaptability may select against, rather than for, immune adaptability in prokaryotes.
Clustered regularly interspaced short palindromic repeats (CRISPRs) are a family of DNA direct repeats found in many prokaryotic genomes. Repeats of 21–37 bp typically show weak dyad symmetry and are separated by regularly sized, nonrepetitive spacer sequences. Four CRISPR-associated (Cas) protein families, designated Cas1 to Cas4, are strictly associated with CRISPR elements and always occur near a repeat cluster. Some spacers originate from mobile genetic elements and are thought to confer “immunity” against the elements that harbor these sequences. In the present study, we have systematically investigated uncharacterized proteins encoded in the vicinity of these CRISPRs and found many additional protein families that are strictly associated with CRISPR loci across multiple prokaryotic species. Multiple sequence alignments and hidden Markov models have been built for 45 Cas protein families. These models identify family members with high sensitivity and selectivity and classify key regulators of development, DevR and DevS, in Myxococcus xanthus as Cas proteins. These identifications show that CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a repeat cluster or filling the region between two repeat clusters. Distinctive subsets of the collection of Cas proteins recur in phylogenetically distant species and correlate with characteristic repeat periodicity. The analyses presented here support initial proposals of mobility of these units, along with the likelihood that loci of different subtypes interact with one another as well as with host cell defensive, replicative, and regulatory systems. It is evident from this analysis that CRISPR/cas loci are larger, more complex, and more heterogeneous than previously appreciated.
The family of clustered regularly interspaced short palindromic repeats (CRISPRs) describes a class of DNA repeats found in nearly half of all bacterial and archaeal genomes. These DNA repeat regions have a remarkably regular structure: unique sequences of constant size, called spacers, sit between each pair of repeats. The DNA repeats do not encode proteins, but appear to be transcribed and processed into small RNAs that may have any number of functions, including resistance to any phage (i.e., virus of bacteria) whose sequence matches a spacer; spacers change rapidly as microbial strains evolve. This work describes 41 new CRISPR-associated (cas) gene families, which are always found near these repeats, in addition to the four previously known. It shows that CRISPR systems belong to different classes, with different repeat patterns, sets of genes, and species ranges. Most of these seem to come and go rather rapidly from their host genomes. These possibly beneficial mobile genetic elements may play an important role in driving prokaryotic evolution.
Clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated genes (cas) are widely distributed among bacteria. These systems provide adaptive immunity against mobile genetic elements specified by the spacer sequences stored within the CRISPR.
The CRISPR-Cas system was identified using the Basic Local Alignment Search Tool (BLAST) against other sequenced and annotated genomes and confirmed with the CRISPRfinder program. Using polymerase chain reaction (PCR) and Sanger DNA sequencing, we discovered CRISPRs in additional bacterial isolates of the same species of Bordetella. Transcriptional activity and processing of the CRISPR were assessed via RT-PCR.
Here we describe a novel Type II-C CRISPR and its associated genes—cas1, cas2, and cas9—in several isolates of a newly discovered Bordetella species. The CRISPR-cas locus, which is absent in all other Bordetella species, has a significantly lower GC-content than the genome-wide average, suggesting acquisition of this locus via horizontal gene transfer from a currently unknown source. The CRISPR array is transcribed and processed into mature CRISPR RNAs (crRNA), some of which have homology to prophages found in closely related species B. hinzii.
Expression of the CRISPR-Cas system and processing of crRNAs with perfect homology to prophages present in closely related species, but absent in that containing this CRISPR-Cas system, suggest it provides protection against phage predation. The 3,117-bp cas9 endonuclease gene from this novel CRISPR-Cas system is 990 bp smaller than that of Streptococcus pyogenes (the 4,107-bp allele currently used for genome editing), which may make it a useful tool in various CRISPR-Cas technologies.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-2028-9) contains supplementary material, which is available to authorized users.
Bordetella pseudohinzii; Type II CRISPR; Cas9; SpyCas9; Bacteria; Genome editing; Protospacer; GC-content; HGT
Background: The Cas6 protein is required for generating crRNAs in CRISPR-Cas I and III systems.
Results: The Cas6 protein is necessary for crRNA production but not sufficient for crRNA maintenance in Haloferax.
Conclusion: A Cascade-like complex is required in the type I-B system for a stable crRNA population.
Significance: The CRISPR-Cas type I-B system has a Cascade complex similar to those of types I-A and I-E.
The clustered regularly interspaced short palindromic repeats/CRISPR-associated (CRISPR-Cas) system is a prokaryotic defense mechanism against foreign genetic elements. A plethora of CRISPR-Cas versions exist, with more than 40 different Cas protein families and several different molecular approaches to fight the invading DNA. One of the key players in the system is the CRISPR-derived RNA (crRNA), which directs the invader-degrading Cas protein complex to the invader. The CRISPR-Cas types I and III use the Cas6 protein to generate mature crRNAs. Here, we show that the Cas6 protein is necessary for crRNA production but that additional Cas proteins, which form a complex resembling Cascade (CRISPR-associated complex for antiviral defense), are needed for crRNA stability in the CRISPR-Cas type I-B system in Haloferax volcanii in vivo. Deletion of the cas6 gene results in the loss of mature crRNAs and interference. However, cells that have the complete cas gene cluster (cas1–8b) removed and are transformed with the cas6 gene are not able to produce and stably maintain mature crRNAs. crRNA production and stability are rescued only if cas5, -6, and -7 are present. Mutational analysis of the cas6 gene reveals three amino acids (His-41, Gly-256, and Gly-258) that are essential for pre-crRNA cleavage, whereas the mutation of two amino acids (Ser-115 and Ser-224) leads to an increase of crRNA amounts. This is the first systematic in vivo analysis of Cas6 protein variants. In addition, we show that the H. volcanii I-B system contains a Cascade-like complex with a Cas7, Cas5, and Cas6 core that protects the crRNA.
Archaea; Microbiology; Molecular Biology; Molecular Genetics; Protein Complexes; CRISPR/Cas; Cas6; Haloferax volcanii; crRNA; Type I-B
CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) loci provide prokaryotes with adaptive immunity against viruses and other mobile genetic elements. CRISPR arrays can be transcribed and processed into small crRNA molecules, which are then used by the cell to target the foreign nucleic acid. Since spacers are accumulated by active CRISPR/Cas systems, the sequences of these spacers provide a record of the past "infection history" of the organism.
Here we analyzed all currently known spacers present in archaeal genomes and identified their source by DNA similarity. While nearly 50% of archaeal spacers matched mobile genetic elements, such as plasmids or viruses, several others matched chromosomal genes of other organisms, primarily other archaea. Thus, networks of gene exchange between archaeal species were revealed by the spacer analysis, including many cases of inter-genus and inter-species gene transfer events. Spacers that recognize viral sequences tend to be located further away from the leader sequence, implying that there exists a selective pressure for their retention.
CRISPR spacers provide direct evidence for extensive gene exchange in archaea, especially within genera, and support the current dogma where the primary role of the CRISPR/Cas system is anti-viral and anti-plasmid defense.
Open peer review
This article was reviewed by: Profs. W. Ford Doolittle, John van der Oost, Christa Schleper (nominated by board member Prof. J Peter Gogarten)
CRISPR; Lateral Gene transfer; Horizontal gene transfer; viruses; archaea; competence
The adaptive immune system comprising CRISPR (clustered regularly interspaced short palindromic repeats) arrays and cas (CRISPR-associated) genes has been discovered in a wide range of bacteria and archaea and has recently attracted comprehensive investigations. However, the subtype I-B CRISPR-Cas system in haloarchaea has been less characterized. Here, we investigated Cas6-mediated RNA processing in Haloferax mediterranei. The Cas6 cleavage site, as well as the CRISPR transcription start site, was experimentally determined, and processing of CRISPR transcripts was detected with a progressively increasing pattern from early log to stationary phase. With genetic approaches, we discovered that the lack of Cas1, Cas3, or Cas4 unexpectedly resulted in a decrease of CRISPR transcripts, while Cas5, Cas6, and Cas7 were found to be essential in stabilizing mature CRISPR RNA (crRNA). Intriguingly, we observed a CRISPR- and Cas3-independent inhibition of a defective provirus, in which the putative Cascade (CRISPR-associated complex for antiviral defense) proteins (Cas5, Cas6, Cas7, and Cas8b) were indispensably required. A sequence carried by a proviral transcript was found to be homologous to the CRISPR repeat RNA and vulnerable to Cas6-mediated cleavage, implying a distinct interference mechanism that may account for this unusual inhibition. These results provide fundamental information for the subtype I-B CRISPR-Cas system in halophilic archaea and suggest diversified mechanisms and multiple physiological functions for the CRISPR-Cas system.
The human bacterial pathogen Listeria monocytogenes is emerging as a model organism to study RNA-mediated regulation in pathogenic bacteria. A class of non-coding RNAs called CRISPRs (clustered regularly interspaced short palindromic repeats) has been described to confer bacterial resistance against invading bacteriophages and conjugative plasmids. CRISPR function relies on the activity of CRISPR associated (cas) genes that encode a large family of proteins with nuclease or helicase activities and DNA and RNA binding domains. Here, we characterized a CRISPR element (RliB) that is expressed and processed in the L. monocytogenes strain EGD-e, which is completely devoid of cas genes. Structural probing revealed that RliB has an unexpected secondary structure comprising basepair interactions between the repeats and the adjacent spacers in place of canonical hairpins formed by the palindromic repeats. Moreover, in contrast to other CRISPR-Cas systems identified in Listeria, RliB-CRISPR is ubiquitously present among Listeria genomes at the same genomic locus and is never associated with the cas genes. We showed that RliB-CRISPR is a substrate for the endogenously encoded polynucleotide phosphorylase (PNPase) enzyme. The spacers of the different Listeria RliB-CRISPRs share many sequences with temperate and virulent phages. Furthermore, we show that a cas-less RliB-CRISPR lowers the acquisition frequency of a plasmid carrying the matching protospacer, provided that trans-encoded cas genes of a second CRISPR-Cas system are present in the genome. Importantly, we show that PNPase is required for RliB-CRISPR mediated DNA interference. Altogether, our data reveal a previously undescribed CRISPR system whose processing and activity both depend on PNPase, highlighting a new and unexpected function for PNPase in “CRISPRology”.
CRISPR-Cas systems confer on bacteria and archaea an adaptive immunity that protects them against invading bacteriophages and plasmids. In this study, we characterize a CRISPR (RliB-CRISPR) that is present in all L. monocytogenes strains at the same genomic locus but is never associated with a cas operon. It is an unusual CRISPR that, as we demonstrate, has a secondary structure consisting of basepair interactions between the repeat sequence and the adjacent spacer. We show that the RliB-CRISPR is processed by the endogenously encoded polynucleotide phosphorylase enzyme (PNPase). In addition, we show that the RliB-CRISPR system requires PNPase and the presence of trans-encoded cas genes of a second CRISPR-Cas system to mediate DNA interference directed against a plasmid carrying a matching protospacer. Altogether, our data reveal a novel type of CRISPR system in bacteria that requires the endogenously encoded PNPase enzyme for its processing and interference activity.
CRISPR-Cas systems are RNA-guided immune systems that protect prokaryotes against viruses and other invaders. The CRISPR locus encodes crRNAs that recognize invading nucleic acid sequences and trigger silencing by the associated Cas proteins. There are multiple CRISPR-Cas systems with distinct compositions and mechanistic processes. Thermococcus kodakarensis (Tko) is a hyperthermophilic euryarchaeon that has both a Type I-A Csa and a Type I-B Cst CRISPR-Cas system. We have analyzed the expression and composition of crRNAs from the three CRISPRs in Tko by RNA deep sequencing and northern analysis. Our results indicate that crRNAs associated with these two CRISPR-Cas systems include an 8-nucleotide conserved sequence tag at the 5′ end. We challenged Tko with plasmid invaders containing sequences targeted by endogenous crRNAs and observed active CRISPR-Cas-mediated silencing. Plasmid silencing was dependent on complementarity with a crRNA as well as on a sequence element found immediately adjacent to the crRNA recognition site in the target termed the PAM (protospacer adjacent motif). Silencing occurred independently of the orientation of the target sequence in the plasmid, and appears to occur at the DNA level, presumably via DNA degradation. In addition, we have directed silencing of an invader plasmid by genetically engineering the chromosomal CRISPR locus to express customized crRNAs directed against the plasmid. Our results support CRISPR engineering as a feasible approach to develop prokaryotic strains that are resistant to infection for use in industry.
CRISPR; Cas; archaea; Thermococcus; hyperthermophile; immune; RNA; DNA; silencing; interference
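The 8-nucleotide 5′ tag described in the abstract above can be pictured with a small model. The following is only a minimal sketch, under the assumption that each mature crRNA is produced by a single cut inside every repeat, 8 nt upstream of the repeat's 3′ end, and that no further 3′ trimming occurs. The repeat and spacer sequences are invented for illustration and are not Tko sequences; `process_pre_crrna` is a hypothetical helper name.

```python
def process_pre_crrna(pre_crrna: str, repeat: str, tag_len: int = 8) -> list[str]:
    """Cut a pre-crRNA inside every repeat, tag_len nt upstream of the repeat's 3' end.

    Simplified model: one cut per repeat, no further trimming of the 3' end.
    """
    cut_sites = []
    start = pre_crrna.find(repeat)
    while start != -1:
        # The cut leaves the last tag_len nucleotides of the repeat
        # attached to the 5' side of the downstream spacer.
        cut_sites.append(start + len(repeat) - tag_len)
        start = pre_crrna.find(repeat, start + len(repeat))
    # Fragments between consecutive cut sites are the modelled crRNAs.
    return [pre_crrna[a:b] for a, b in zip(cut_sites, cut_sites[1:])]


# Hypothetical sequences, chosen only for illustration.
REPEAT = "GGGAAACCCUUUAUUGAAAG"
SPACERS = ["ACGUACGUAC", "UUGGCCAAUU"]

# repeat - spacer1 - repeat - spacer2 - repeat
pre = REPEAT + "".join(spacer + REPEAT for spacer in SPACERS)
crrnas = process_pre_crrna(pre, REPEAT)

for crrna in crrnas:
    print(crrna)
```

Under these assumptions every modelled crRNA begins with the last 8 nt of the repeat, followed by one spacer and the remainder of the next repeat.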
CRISPR/Cas, bacterial and archaeal systems of interference with foreign genetic elements such as viruses or plasmids, consist of DNA loci called CRISPR cassettes (a set of variable spacers regularly separated by palindromic repeats) and associated cas genes. When a CRISPR spacer sequence exactly matches a sequence in a viral genome, the cell can become resistant to the virus. The CRISPR/Cas systems function through small RNAs originating from longer CRISPR cassette transcripts. While laboratory strains of Escherichia coli contain a functional CRISPR/Cas system (as judged by the appearance of phage resistance under conditions of artificial co-overexpression of Cas genes and a CRISPR cassette engineered to target a λ phage), no natural phage resistance due to CRISPR system function has been observed in this best-studied organism, and no E. coli CRISPR spacer matches sequences of well-studied E. coli phages. To better understand the apparently “silent” E. coli CRISPR/Cas system, we systematically characterized processed transcripts from CRISPR cassettes. Using an engineered strain with a genomically located spacer matching phage λ, we show that endogenous levels of CRISPR cassette and cas gene expression allow only weak protection against infection with the phage. However, derepression of the CRISPR/Cas system by disruption of the hns gene leads to a high level of protection.
Guide RNA molecules (crRNA) produced from clustered regularly interspaced short palindromic repeat (CRISPR) arrays, together with effector proteins (Cas) encoded by cognate cas (CRISPR associated) genes, mount an interference mechanism (CRISPR-Cas) that limits acquisition of foreign DNA in Bacteria and Archaea. The specificity of this action is provided by the repeat-intervening spacer carried in the crRNA, which upon hybridization with complementary sequences enables their degradation by a Cas endonuclease. Moreover, CRISPR arrays are dynamic landscapes that may gain new spacers from infecting elements or lose them, for example during genome replication. Thus, the spacer content of a strain determines the diversity of sequences that can be targeted by the corresponding CRISPR-Cas system, reflecting its functionality. Most Escherichia coli strains possess either type I-E or I-F CRISPR-Cas systems. To evaluate their impact on the pathogenicity of the species, we inferred the pathotype and pathogenic potential of 126 strains of this and other closely related species and analyzed their repeat content. Our results revealed a negative correlation between the number of I-E CRISPR units in this system and the presence of pathogenicity traits: the median number of repeats was 2.5-fold higher for commensal isolates (with 29.5 units, range 0–53) than for pathogenic ones (12.0, range 0–42). Moreover, the higher the number of virulence factors within a strain, the lower the repeat content. Additionally, pathogenic strains of distinct ecological niches (i.e., intestinal or extraintestinal) differ in repeat counts. Altogether, these findings support an evolutionary connection between CRISPR and pathogenicity in E. coli.
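The reported fold difference can be checked directly from the two medians quoted above. This is a minimal arithmetic sketch that uses only those two values; the per-strain repeat counts are not given in the abstract and are not reconstructed here, and the variable names are illustrative.

```python
# Median I-E repeat counts as quoted in the abstract.
commensal_median = 29.5
pathogenic_median = 12.0

# Ratio of the medians (commensal over pathogenic).
fold_difference = commensal_median / pathogenic_median

print(f"{fold_difference:.2f}-fold")  # prints "2.46-fold"
```

The ratio of about 2.46 is consistent with the rounded value of 2.5-fold stated in the text.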
Clustered, regularly interspaced short palindromic repeats (CRISPR) provide bacteria and archaea with sequence-specific, acquired defense against plasmids and phage. Because mobile elements constitute up to 25% of the genome of multidrug-resistant (MDR) enterococci, it was of interest to examine the codistribution of CRISPR and acquired antibiotic resistance in enterococcal lineages. A database was built from 16 Enterococcus faecalis draft genome sequences to identify commonalities and polymorphisms in the location and content of CRISPR loci. With this data set, we were able to detect identities between CRISPR spacers and sequences from mobile elements, including pheromone-responsive plasmids and phage, suggesting that CRISPR regulates the flux of these elements through the E. faecalis species. Based on conserved locations of CRISPR and CRISPR-cas loci and the discovery of a new CRISPR locus with associated functional genes, CRISPR3-cas, we screened additional E. faecalis strains for CRISPR content, including isolates predating the use of antibiotics. We found a highly significant inverse correlation between the presence of a CRISPR-cas locus and acquired antibiotic resistance in E. faecalis, and examination of an additional eight E. faecium genomes yielded similar results for that species. A mechanism for CRISPR-cas loss in E. faecalis was identified. The inverse relationship between CRISPR-cas and antibiotic resistance suggests that antibiotic use inadvertently selects for enterococcal strains with compromised genome defense.
For many bacteria, including the opportunistically pathogenic enterococci, antibiotic resistance is mediated by acquisition of new DNA and is frequently encoded on mobile DNA elements such as plasmids and transposons. Certain enterococcal lineages have recently emerged that are characterized by abundant mobile DNA, including numerous viruses (phage), and plasmids and transposons encoding multiple antibiotic resistances. These lineages cause hospital infection outbreaks around the world. The striking influx of mobile DNA into these lineages is in contrast to what would be expected if a self (genome)-defense system was present. Clustered, regularly interspaced short palindromic repeat (CRISPR) defense is a recently discovered mechanism of prokaryotic self-defense that provides a type of acquired immunity. Here, we find that antibiotic resistance and possession of complete CRISPR loci are inversely related and that members of recently emerged high-risk enterococcal lineages lack complete CRISPR loci. Our results suggest that antibiotic therapy inadvertently selects for enterococci with compromised genome defense.
CRISPR-Cas is an adaptive prokaryotic immune system, providing protection against viruses and other mobile genetic elements. In type I and type III CRISPR-Cas systems, CRISPR RNA (crRNA) is generated by cleavage of a primary transcript by the Cas6 endonuclease and loaded into multisubunit surveillance/effector complexes, allowing homology-directed detection and cleavage of invading elements. Highly studied CRISPR-Cas systems such as those in Escherichia coli and Pseudomonas aeruginosa have a single Cas6 enzyme that is an integral subunit of the surveillance complex. By contrast, Sulfolobus solfataricus has a complex CRISPR-Cas system with three types of surveillance complexes (Cascade/type I-A, CSM/type III-A and CMR/type III-B), five Cas6 paralogues and two different CRISPR-repeat families (AB and CD). Here, we investigate the kinetic properties of two different Cas6 paralogues from S. solfataricus. The Cas6-1 subtype is specific for CD-family CRISPR repeats, generating crRNA by multiple turnover catalysis whilst Cas6-3 has a broader specificity and also processes a non-coding RNA with a CRISPR repeat-related sequence. Deep sequencing of crRNA in surveillance complexes reveals a biased distribution of spacers derived from AB and CD loci, suggesting functional coupling between Cas6 paralogues and their downstream effector complexes.
Well-studied innate immune systems exist throughout bacteria and archaea, but a more recently discovered genomic locus may offer prokaryotes surprising immunological adaptability. Mediated by a cassette-like genomic locus termed Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR), the microbial adaptive immune system differs from its eukaryotic immune analogues by incorporating new immunities unidirectionally. CRISPR thus stores genomically recoverable timelines of virus-host coevolution in natural organisms refractory to laboratory cultivation. Here we combined a population genetic mathematical model of CRISPR-virus coevolution with six years of metagenomic sequencing to link the recoverable genomic dynamics of CRISPR loci to the unknown population dynamics of virus and host in natural communities. Metagenomic reconstructions in an acid-mine drainage system document CRISPR loci conserving ancestral immune elements to the base-pair across thousands of microbial generations. This ‘trailer-end conservation’ occurs despite rapid viral mutation and despite rapid prokaryotic genomic deletion. The trailer-ends of many reconstructed CRISPR loci are also largely identical across a population. ‘Trailer-end clonality’ occurs despite predictions of host immunological diversity due to negative frequency dependent selection (kill the winner dynamics). Statistical clustering and model simulations explain this lack of diversity by capturing rapid selective sweeps by highly immune CRISPR lineages. Potentially explaining ‘trailer-end conservation,’ we record the first example of a viral bloom overwhelming a CRISPR system. The polyclonal viruses bloom even though they share sequences previously targeted by host CRISPR loci. Simulations show how increasing random genomic deletions in CRISPR loci purges immunological controls on long-lived viral sequences, allowing polyclonal viruses to bloom and depressing host fitness. 
Our results thus link documented patterns of genomic conservation in CRISPR loci to an evolutionary advantage against persistent viruses. By maintaining old immunities, selection may be tuning CRISPR-mediated immunity against viruses reemerging from lysogeny or migration.
Most microbes appear unculturable in the laboratory, limiting our knowledge of how virus and prokaryotic host evolve in natural systems. However, a genomic locus found in many prokaryotes, CRISPR, may offer cultivation-independent probes of virus-microbe coevolution. Utilizing nearby genes, CRISPR can serially incorporate short viral and plasmid sequences. These sequences bind and cleave cognate regions in subsequent viral and plasmid insertions, conferring adaptive anti-viral and anti-plasmid immunity. By incorporating sequences unidirectionally, CRISPR also provides timelines of virus-prokaryote coevolution. Yet, CRISPR only incorporates 30–80 base-pair viral sequences, leaving incomplete coevolutionary recordings. To reconstruct the missing coevolutionary dynamics shaping natural CRISPRs, we combined metagenomic reconstructions with population-scale mathematical modeling. Capturing rare and rapid sweeps of CRISPR diversity by highly immune lines, mathematical modeling explains why naturally reconstructed CRISPR loci are often largely identical across a population. Both model and experiment further document surprising proliferations of old viral sequences against which hosts had preexisting CRISPR immunity. Due to these deadly blooms of ancestral viral elements, CRISPR's conservation of old immune sequences appears to confer a selective advantage. This may explain the striking immunological memory documented in CRISPR loci, which occurs despite rapid viral mutation and despite rapid deletions in prokaryotic genomes.
Background: CRISPR/Cas systems allow archaea and bacteria to resist invasion by foreign nucleic acids.
Results: The CRISPR/Cas system in Haloferax recognized six different PAM sequences that could trigger a defense response.
Conclusion: The PAM sequence specificity of the defense response in type I CRISPR systems is more relaxed than previously thought.
Significance: The PAM sequence requirements for interference and adaptation appear to differ markedly.
The clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated (Cas) system provides adaptive and heritable immunity against foreign genetic elements in most archaea and many bacteria. Although this system is widespread and diverse with many subtypes, only a few species have been investigated to elucidate the precise mechanisms of defense against viruses or plasmids. Approximately 90% of all sequenced archaea encode CRISPR/Cas systems, but their molecular details have so far only been examined in three archaeal species: Sulfolobus solfataricus, Sulfolobus islandicus, and Pyrococcus furiosus. Here, we analyzed the CRISPR/Cas system of Haloferax volcanii using a plasmid-based invader assay. Haloferax encodes a type I-B CRISPR/Cas system with eight Cas proteins and three CRISPR loci for which the identity of protospacer adjacent motifs (PAMs) was unknown until now. We identified six different PAM sequences that are required upstream of the protospacer to permit target DNA recognition. This is only the second archaeon for which PAM sequences have been determined, and the first CRISPR group with such a high number of PAM sequences. Cells could survive the plasmid challenge if their CRISPR/Cas system was altered or defective, e.g. by deletion of the cas gene cassette. Experimental PAM data were supplemented with bioinformatics data on Haloferax and Haloquadratum.
Archaea; Microbiology; RNA; RNA Metabolism; RNA Processing; CRISPR/Cas; Haloferax volcanii; PAM
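The requirement for a PAM directly upstream of the protospacer can be illustrated with a short sketch. It assumes exact spacer matching on a single strand, and the PAM set, spacer and plasmid fragments below are placeholders invented for the example; they are not the six PAMs reported for Haloferax. The function name `find_targets` is likewise hypothetical.

```python
def find_targets(dna: str, spacer: str, pams: set[str]) -> list[int]:
    """Return start positions of exact spacer matches preceded by an accepted PAM.

    Simplified model: single strand only, no mismatches tolerated.
    """
    hits = []
    pos = dna.find(spacer)
    while pos != -1:
        # Accept the site only if one of the PAMs sits immediately upstream.
        if any(pos >= len(p) and dna[pos - len(p):pos] == p for p in pams):
            hits.append(pos)
        pos = dna.find(spacer, pos + 1)
    return hits


# Placeholder values, for illustration only.
PAMS = {"TTC", "ACT"}
SPACER = "GCATGCAAGCTTGG"

plasmid_with_pam = "AAATTC" + SPACER + "AAA"
plasmid_without_pam = "AAAGGG" + SPACER + "AAA"

print(find_targets(plasmid_with_pam, SPACER, PAMS))     # [6]
print(find_targets(plasmid_without_pam, SPACER, PAMS))  # []
```

In this model a plasmid carrying the protospacer escapes recognition when the upstream motif is not in the accepted set, which mirrors the logic of a plasmid-based invader assay.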
Riemerella anatipestifer infection is a contagious disease that has resulted in major economic losses in the duck industry worldwide. This study attempted to characterize CRISPR-Cas systems in the disease-causing agent, Riemerella anatipestifer (R. anatipestifer). The CRISPR-Cas system provides adaptive immunity against foreign genetic elements in prokaryotes, and CRISPR-cas loci are widespread in the genomes of archaea and bacteria. However, the structural characteristics of R. anatipestifer CRISPR-Cas systems remain to be elucidated due to the limited availability of genomic data.
To identify the structure and components associated with CRISPR-Cas systems in R. anatipestifer, we performed comparative genomic analysis of CRISPR-Cas systems in 25 R. anatipestifer strains using high-throughput sequencing. The results showed that most of the R. anatipestifer strains (20/25) that were analyzed have two CRISPR loci (CRISPR1 and CRISPR2). CRISPR1 was shown to be flanked on one side by cas genes, while CRISPR2 was designated as an orphan. The other analyzed strains harbored only one locus, either CRISPR1 or CRISPR2. The length and content of consensus direct repeat sequences, as well as the length of spacer sequences associated with the two loci, differed from each other. Only three cas genes (cas1, cas2 and cas9) were located upstream of CRISPR1. CRISPR1 was also shown to be flanked by a 107 bp-long putative leader sequence and a 16 nt-long anti-repeat sequence. Based on the similarity of spacer organization and a phylogenetic tree of the R. anatipestifer strains, the CRISPR arrays can be divided into different subgroups. Diversity in spacer organization was observed even within the same subgroup. In general, spacer organization in CRISPR1 was more divergent than that in CRISPR2. Additionally, only 8% of spacers (13/153) were homologous with phage or plasmid sequences. The cas operon flanking CRISPR1 was observed to be relatively conserved based on multiple sequence alignments of Cas amino acid sequences. Phylogenetic analysis of Cas9 showed that the Cas9 sequence from R. anatipestifer was closely related to that of Bacteroides fragilis and formed part of the subtype II-C subcluster.
Our data revealed for the first time the structural features of R. anatipestifer CRISPR-Cas systems. The illumination of structural features of CRISPR-Cas system may assist in studying the specific mechanism associated with CRISPR-mediated adaptive immunity and other biological functions in R. anatipestifer.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-016-3040-4) contains supplementary material, which is available to authorized users.
Riemerella anatipestifer; CRISPR-Cas system; cas gene; repeat sequence; spacer sequence; phylogenetic analysis