The adaptive prokaryotic immune system CRISPR-Cas provides RNA-mediated protection from invading genetic elements. The fundamental basis of the system is the ability to capture small pieces of foreign DNA for incorporation into the genome at the CRISPR locus, a process known as adaptation, which is dependent on the Cas1 and Cas2 proteins. We demonstrate that Cas1 catalyses an efficient trans-esterification reaction on branched DNA substrates, which represents the reverse, or disintegration, reaction. Cas1 from both Escherichia coli and Sulfolobus solfataricus displays sequence-specific activity, with a clear preference for the nucleotides flanking the integration site at the leader-repeat 1 boundary of the CRISPR locus. Cas2 is not required for this activity and does not influence the specificity. This suggests that the inherent sequence specificity of Cas1 is a major determinant of the adaptation process.
In most animals, the adaptive immune system creates specialized cells that adapt to efficiently fight off any viruses or other pathogens that have invaded. Bacteria (and another group of single-celled organisms called archaea) also have an adaptive immune system, known as CRISPR-Cas, that combats viral invaders. This system is based on sections of the microbes' DNA called CRISPRs, which contain repetitive DNA sequences that are separated by short segments of ‘spacer’ DNA. When a virus invades the cell, some viral DNA is incorporated into the CRISPR as a spacer. This process is known as adaptation. CRISPR-associated proteins (or ‘Cas’ proteins) then use this spacer to recognize and mount an attack on any matching invader DNA that is later encountered.
Exactly how a spacer is inserted into the correct position in the CRISPR array during adaptation remains poorly understood. However, it is known that two CRISPR proteins called Cas1 and Cas2 play essential roles in this process.
Rollie et al. took Cas1 proteins from a bacterial cell (Escherichia coli) and an archaeal species (Sulfolobus solfataricus) and added them to branched DNA structures in the laboratory. These experiments revealed that Cas1 from both organisms can break the DNA down into smaller pieces. Cas2, on the other hand, is not required for this process. This ‘disintegration’ reaction is the reverse process of the ‘integration’ step of adaptation where the CRISPR proteins insert the invader DNA into the CRISPR array.
Rollie et al. also found that the disintegration reaction performed by Cas1 takes place on specific DNA sequences, which are also the sites where Cas1 inserts the spacer DNA during adaptation. Therefore, by examining the disintegration reaction, many of the details of the integration step can be deduced.
Overall, Rollie et al. show that selection by Cas1 plays an important role in restricting the adaptation process to particular DNA sites. The next step will be to use the disintegration reaction to examine the DNA binding and manipulation steps performed by Cas1 as part of its role in the adaptation of the CRISPR system.
Sulfolobus solfataricus; CRISPR; integrase; adaptation; E. coli; other
Prokaryotes immunize themselves against transmissible genetic elements by the integration (acquisition) in clustered regularly interspaced short palindromic repeats (CRISPR) loci of spacers homologous to invader nucleic acids, defined as protospacers. Following acquisition, mono-spacer CRISPR RNAs (termed crRNAs) guide CRISPR-associated (Cas) proteins to degrade (interference) protospacers flanked by an adjacent motif in extrachromosomal DNA. During acquisition, selection of spacer-precursors adjoining the protospacer motif and proper orientation of the integrated fragment with respect to the leader (sequence leading transcription of the flanking CRISPR array) grant efficient interference by at least some CRISPR-Cas systems. This adaptive stage of the CRISPR action is poorly characterized, mainly due to the lack of appropriate genetic strategies to address its study and, at least in Escherichia coli, the need for Cas overproduction for insertion detection. In this work, we describe the development and application in Escherichia coli strains of an interference-independent assay based on engineered selectable CRISPR-spacer integration reporter plasmids. By using this tool without the constraint of interference or cas overexpression, we confirmed fundamental aspects of this process such as the critical requirement of Cas1 and Cas2 and the identity of the CTT protospacer motif for the E. coli K12 system. In addition, we defined the CWT motif for a non-K12 CRISPR-Cas variant, and obtained data supporting the implication of the leader in spacer orientation, the preferred acquisition from plasmids harboring cas genes and the occurrence of a sequential cleavage at the insertion site by a ruler mechanism.
CRISPR-spacer acquisition; Cascade; Escherichia coli K12; O157:H7; RNA-guided immunity; cas genes; protospacer adjacent motif; reporter plasmids; ruler mechanism; spacer orientation
Discriminating self and non-self is a universal requirement of immune systems. Adaptive immune systems in prokaryotes are centered around repetitive loci called CRISPRs (clustered regularly interspaced short palindromic repeats), into which invader DNA fragments are incorporated. CRISPR transcripts are processed into small RNAs that guide CRISPR-associated (Cas) proteins to invading nucleic acids by complementary base pairing. However, to avoid autoimmunity it is essential that these RNA-guides exclusively target invading DNA and not complementary DNA sequences (i.e., self-sequences) located in the host's own CRISPR locus. Previous work on the Type III-A CRISPR system from Staphylococcus epidermidis has demonstrated that a portion of the CRISPR RNA-guide sequence is involved in self versus non-self discrimination. This self-avoidance mechanism relies on sensing base pairing between the RNA-guide and sequences flanking the target DNA. To determine if the RNA-guide participates in self versus non-self discrimination in the Type I-E system from Escherichia coli we altered base pairing potential between the RNA-guide and the flanks of DNA targets. Here we demonstrate that Type I-E systems discriminate self from non-self through a base pairing-independent mechanism that strictly relies on the recognition of four unchangeable PAM sequences. In addition, this work reveals that the first base pair between the guide RNA and the PAM nucleotide immediately flanking the target sequence can be disrupted without affecting the interference phenotype. Remarkably, this indicates that base pairing at this position is not involved in foreign DNA recognition. Results in this paper reveal that the Type I-E mechanism of avoiding self sequences and preventing autoimmunity is fundamentally different from that employed by Type III-A systems. We propose the exclusive targeting of PAM-flanked sequences to be termed a target versus non-target discrimination mechanism.
CRISPR loci and their associated genes form a diverse set of adaptive immune systems that are widespread among prokaryotes. In these systems, the CRISPR-associated genes (cas) encode for proteins that capture fragments of invading DNA and integrate these sequences between repeat sequences of the host's CRISPR locus. This information is used upon re-infection to degrade invader genomes. Storing invader sequences in host genomes necessitates a mechanism to differentiate between invader sequences on invader genomes and invader sequences on the host genome. CRISPR-Cas of Staphylococcus epidermidis (Type III-A system) is inhibited when invader sequences are flanked by repeat sequences, and this prevents targeting of the CRISPR locus on the host genome. Here we demonstrate that Escherichia coli CRISPR-Cas (Type I-E system) is not inhibited by repeat sequences. Instead, this system is specifically activated by the presence of bona fide Protospacer Adjacent Motifs (PAMs) in the target. PAMs are conserved sequences adjoining invader sequences on the invader genome, and these sequences are never adjacent to invader sequences within host CRISPR loci. PAM recognition is not affected by base pairing potential of the target with the crRNA. As such, the Type I-E system lacks the ability to specifically recognize self DNA.
CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) loci, together with cas (CRISPR–associated) genes, form the CRISPR/Cas adaptive immune system, a primary defense strategy that eubacteria and archaea mobilize against foreign nucleic acids, including phages and conjugative plasmids. Short spacer sequences separated by the repeats are derived from foreign DNA and direct interference to future infections. The availability of hundreds of shotgun metagenomic datasets from the Human Microbiome Project (HMP) enables us to explore the distribution and diversity of known CRISPRs in human-associated microbial communities and to discover new CRISPRs. We propose a targeted assembly strategy to reconstruct CRISPR arrays, which whole-metagenome assemblies fail to identify. For each known CRISPR type (identified from reference genomes), we use its direct repeat consensus sequence to recruit reads from each HMP dataset and then assemble the recruited reads into CRISPR loci; the unique spacer sequences can then be extracted for analysis. We also identified novel CRISPRs or new CRISPR variants in contigs from whole-metagenome assemblies and used targeted assembly to more comprehensively identify these CRISPRs across samples. We observed that the distributions of CRISPRs (including 64 known and 86 novel ones) are largely body-site specific. We provide detailed analysis of several CRISPR loci, including novel CRISPRs. For example, known streptococcal CRISPRs were identified in most oral microbiomes, totaling ∼8,000 unique spacers: samples resampled from the same individual and oral site shared the most spacers; different oral sites from the same individual shared significantly fewer, while different individuals had almost no common spacers, indicating the impact of subtle niche differences on the evolution of CRISPR defenses. We further demonstrate potential applications of CRISPRs to the tracing of rare species and the virus exposure of individuals. 
This work indicates the importance of effective identification and characterization of CRISPR loci to the study of the dynamic ecology of microbiomes.
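The recruit-then-extract idea behind the targeted assembly strategy described above can be illustrated with a minimal sketch. It assumes exact matching of the repeat consensus and skips the assembly step entirely, so it is a simplification for illustration and not the pipeline used in the study. The repeat sequence, the reads and the function names `recruit_reads` and `extract_spacers` are all hypothetical.

```python
import re

def recruit_reads(reads, repeat):
    """Keep only reads that contain at least one exact copy of the repeat."""
    return [read for read in reads if repeat in read]

def extract_spacers(sequence, repeat):
    """Return the segments lying between consecutive copies of the repeat."""
    starts = [m.start() for m in re.finditer(re.escape(repeat), sequence)]
    spacers = []
    for left, right in zip(starts, starts[1:]):
        spacer = sequence[left + len(repeat):right]
        if spacer:  # skip empty segments from back-to-back repeats
            spacers.append(spacer)
    return spacers

# Hypothetical repeat and reads, invented for illustration only.
REPEAT = "GTTTTAGAGC"
reads = [
    REPEAT + "AAACCCGGGT" + REPEAT + "TTTGGGCCCA" + REPEAT,
    "ACGTACGTACGTACGT",                     # no repeat, so not recruited
    "CC" + REPEAT + "AAACCCGGGT" + REPEAT,  # shares a spacer with the first read
]
recruited = recruit_reads(reads, REPEAT)
unique_spacers = sorted(
    {s for read in recruited for s in extract_spacers(read, REPEAT)}
)
print(unique_spacers)
```

A realistic implementation would presumably need to tolerate sequencing errors in the repeat and search both strands; exact matching is used here only to keep the example short.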
Human bodies are complex ecological systems in which various microbial organisms and viruses interact with each other and with the human host. The Human Microbiome Project (HMP) has resulted in >700 datasets of shotgun metagenomic sequences, from which we can learn about the compositions and functions of human-associated microbial communities. CRISPR/Cas systems are a widespread class of adaptive immune systems in bacteria and archaea, providing acquired immunity against foreign nucleic acids: CRISPR/Cas defense pathways involve integration of viral- or plasmid-derived DNA segments into CRISPR arrays (forming spacers between repeated structural sequences), and expression of short crRNAs from these single repeat-spacer units, to generate interference to future invading foreign genomes. Powered by an effective computational approach (the targeted assembly approach for CRISPR), our analysis of CRISPR arrays in the HMP datasets provides the very first global view of bacterial immunity systems in human-associated microbial communities. The great diversity of CRISPR spacers we observed among different body sites, in different individuals, and in single individuals over time, indicates the impact of subtle niche differences on the evolution of CRISPR defenses and indicates the key role of bacteriophage (and plasmids) in shaping human microbial communities.
In prokaryotes, clustered regularly interspaced short palindromic repeats (CRISPRs) and their associated (Cas) proteins constitute a defence system against bacteriophages and plasmids. CRISPR/Cas systems acquire short spacer sequences from foreign genetic elements and incorporate these into their CRISPR arrays, generating a memory of past invaders. Defence is provided by short non-coding RNAs that guide Cas proteins to cleave complementary nucleic acids. While most spacers are acquired from phages and plasmids, there are examples of spacers that match genes elsewhere in the host bacterial chromosome. In Pectobacterium atrosepticum the type I-F CRISPR/Cas system has acquired a self-complementary spacer that perfectly matches a protospacer target in a horizontally acquired island (HAI2) involved in plant pathogenicity. Given the paucity of experimental data about CRISPR/Cas–mediated chromosomal targeting, we examined this process by developing a tightly controlled system. Chromosomal targeting was highly toxic via targeting of DNA and resulted in growth inhibition and cellular filamentation. The toxic phenotype was avoided by mutations in the cas operon, the CRISPR repeats, the protospacer target, and protospacer-adjacent motif (PAM) beside the target. Indeed, the natural self-targeting spacer was non-toxic due to a single nucleotide mutation adjacent to the target in the PAM sequence. Furthermore, we show that chromosomal targeting can result in large-scale genomic alterations, including the remodelling or deletion of entire pre-existing pathogenicity islands. These features can be engineered for the targeted deletion of large regions of bacterial chromosomes. In conclusion, in DNA–targeting CRISPR/Cas systems, chromosomal interference is deleterious by causing DNA damage and providing a strong selective pressure for genome alterations, which may have consequences for bacterial evolution and pathogenicity.
Bacteria have evolved mechanisms that provide protection from continual invasion by viruses and other foreign elements. Resistance systems, known as CRISPR/Cas, were recently discovered and equip bacteria and archaea with an “adaptive immune system.” This adaptive immunity provides a highly evolvable sequence-specific small RNA–based memory of past invasions by viruses and foreign genetic elements. There are many cases where these systems appear to target regions within the bacterial host's own genome (a possible autoimmunity), but the evolutionary rationale for this is unclear. Here, we demonstrate that CRISPR/Cas targeting of the host chromosome is highly toxic but that cells survive through mutations that alleviate the immune mechanism. We have used this phenotype to gain insight into how these systems function and show that large changes in the bacterial genome can occur. For example, targeting of a chromosomal pathogenicity island, important for virulence of the potato pathogen Pectobacterium atrosepticum, resulted in deletion of the island, which constituted ∼2% of the bacterial genome. These results have broad significance for the role of CRISPR/Cas systems and their impact on the evolution of bacterial genomes and virulence. In addition, this study demonstrates their potential as a tool for the targeted deletion of specific regions of bacterial chromosomes.
The adaptation against foreign nucleic acids by the CRISPR–Cas system (Clustered Regularly Interspaced Short Palindromic Repeats and CRISPR-associated proteins) depends on the insertion of foreign nucleic acid-derived sequences into the CRISPR array as novel spacers by a still unknown mechanism. We identified and characterized in Escherichia coli intermediate states of spacer integration and mapped the integration site at the chromosomal CRISPR array in vivo. The results show that the insertion of new spacers occurs by site-specific nicking at both strands of the leader proximal repeat in a staggered way and is accompanied by joining of the resulting 5′-ends of the repeat strands with the 3′-ends of the incoming spacer. This concerted cleavage-ligation reaction depends on the metal-binding center of the Cas1 protein and requires the presence of Cas2. By acquisition assays using a plasmid-located CRISPR array with mutated repeat sequences, we demonstrate that the primary sequence of the first repeat is crucial for cleavage of the CRISPR array and the ligation of new spacer DNA.
The human bacterial pathogen Listeria monocytogenes is emerging as a model organism to study RNA-mediated regulation in pathogenic bacteria. A class of non-coding RNAs called CRISPRs (clustered regularly interspaced short palindromic repeats) has been described to confer bacterial resistance against invading bacteriophages and conjugative plasmids. CRISPR function relies on the activity of CRISPR associated (cas) genes that encode a large family of proteins with nuclease or helicase activities and DNA and RNA binding domains. Here, we characterized a CRISPR element (RliB) that is expressed and processed in the L. monocytogenes strain EGD-e, which is completely devoid of cas genes. Structural probing revealed that RliB has an unexpected secondary structure comprising basepair interactions between the repeats and the adjacent spacers in place of canonical hairpins formed by the palindromic repeats. Moreover, in contrast to other CRISPR-Cas systems identified in Listeria, RliB-CRISPR is ubiquitously present among Listeria genomes at the same genomic locus and is never associated with the cas genes. We showed that RliB-CRISPR is a substrate for the endogenously encoded polynucleotide phosphorylase (PNPase) enzyme. The spacers of the different Listeria RliB-CRISPRs share many sequences with temperate and virulent phages. Furthermore, we show that a cas-less RliB-CRISPR lowers the acquisition frequency of a plasmid carrying the matching protospacer, provided that trans encoded cas genes of a second CRISPR-Cas system are present in the genome. Importantly, we show that PNPase is required for RliB-CRISPR mediated DNA interference. Altogether, our data reveal a previously undescribed CRISPR system whose processing and activity both depend on PNPase, highlighting a new and unexpected function for PNPase in “CRISPRology”.
CRISPR-Cas systems confer on bacteria and archaea an adaptive immunity that protects them against invading bacteriophages and plasmids. In this study, we characterize a CRISPR (RliB-CRISPR) that is present in all L. monocytogenes strains at the same genomic locus but is never associated with a cas operon. It is an unusual CRISPR that, as we demonstrate, has a secondary structure consisting of basepair interactions between the repeat sequence and the adjacent spacer. We show that the RliB-CRISPR is processed by the endogenously encoded polynucleotide phosphorylase enzyme (PNPase). In addition, we show that the RliB-CRISPR system requires PNPase and the presence of trans encoded cas genes of a second CRISPR-Cas system to mediate DNA interference directed against a plasmid carrying a matching protospacer. Altogether, our data reveal a novel type of CRISPR system in bacteria that requires the endogenously encoded PNPase enzyme for its processing and interference activity.
Bacteria and archaea face continual onslaughts of rapidly diversifying viruses and plasmids. Many prokaryotes maintain adaptive immune systems known as clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated genes (Cas). CRISPR-Cas systems are genomic sensors that serially acquire viral and plasmid DNA fragments (spacers) that are utilized to target and cleave matching viral and plasmid DNA in subsequent genomic invasions, offering critical immunological memory. Only 50% of sequenced bacteria possess CRISPR-Cas immunity, in contrast to over 90% of sequenced archaea. To probe why half of bacteria lack CRISPR-Cas immunity, we combined comparative genomics and mathematical modeling. Analysis of hundreds of diverse prokaryotic genomes shows that CRISPR-Cas systems are substantially more prevalent in thermophiles than in mesophiles. With sequenced bacteria disproportionately mesophilic and sequenced archaea mostly thermophilic, the presence of CRISPR-Cas appears to depend more on environmental temperature than on bacterial-archaeal taxonomy. Mutation rates are typically severalfold higher in mesophilic prokaryotes than in thermophilic prokaryotes. To quantitatively test whether accelerated viral mutation leads microbes to lose CRISPR-Cas systems, we developed a stochastic model of virus-CRISPR coevolution. The model competes CRISPR-Cas-positive (CRISPR-Cas+) prokaryotes against CRISPR-Cas-negative (CRISPR-Cas−) prokaryotes, continually weighing the antiviral benefits conferred by CRISPR-Cas immunity against its fitness costs. Tracking this cost-benefit analysis across parameter space reveals viral mutation rate thresholds beyond which CRISPR-Cas cannot provide sufficient immunity and is purged from host populations. These results offer a simple, testable viral diversity hypothesis to explain why mesophilic bacteria disproportionately lack CRISPR-Cas immunity. 
More generally, fundamental limits on the adaptability of biological sensors (Lamarckian evolution) are predicted.
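The notion of a viral mutation rate threshold can be made concrete with a deliberately simple, deterministic toy calculation. This is not the stochastic coevolution model described above. It assumes only that a spacer protects the host while its protospacer is unmutated, that each of the protospacer bases mutates independently with probability `mu` per viral generation, and that immunity carries a fixed fitness cost. All parameter names and values are hypothetical.

```python
def match_probability(mu, length):
    """Chance that none of `length` protospacer bases mutates in one viral generation."""
    return (1.0 - mu) ** length

def net_advantage(mu, length, benefit, cost):
    """Toy payoff of carrying CRISPR-Cas: expected protection minus a fixed cost."""
    return benefit * match_probability(mu, length) - cost

def threshold_mutation_rate(length, benefit, cost):
    """Mutation rate at which the toy payoff is zero (requires 0 < cost < benefit)."""
    return 1.0 - (cost / benefit) ** (1.0 / length)

# Hypothetical values chosen only to show the sign change around the threshold.
LENGTH, BENEFIT, COST = 30, 0.5, 0.1
mu_star = threshold_mutation_rate(LENGTH, BENEFIT, COST)
print(mu_star)
print(net_advantage(mu_star / 2, LENGTH, BENEFIT, COST) > 0)  # below threshold
print(net_advantage(mu_star * 2, LENGTH, BENEFIT, COST) < 0)  # above threshold
```

In this toy setting the threshold follows from solving benefit × (1 − mu)^length = cost for mu. The published model additionally tracks competing host populations and stochastic dynamics, none of which is represented here.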
A remarkable recent discovery in microbiology is that bacteria and archaea possess systems conferring immunological memory and adaptive immunity. Clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated genes (CRISPR-Cas) are genomic sensors that allow prokaryotes to acquire DNA fragments from invading viruses and plasmids. Providing immunological memory, these stored fragments destroy matching DNA in future viral and plasmid invasions. CRISPR-Cas systems also provide adaptive immunity, keeping up with mutating viruses and plasmids by continually acquiring new DNA fragments. Surprisingly, less than 50% of mesophilic bacteria, in contrast to almost 90% of thermophilic bacteria and Archaea, maintain CRISPR-Cas immunity. Using mathematical modeling, we probe this dichotomy, showing how increased viral mutation rates can explain the reduced prevalence of CRISPR-Cas systems in mesophiles. Rapidly mutating viruses outrun CRISPR-Cas immune systems, likely decreasing their prevalence in bacterial populations. Thus, viral adaptability may select against, rather than for, immune adaptability in prokaryotes.
CRISPR/Cas is a widespread adaptive immune system in prokaryotes. This system integrates short stretches of DNA derived from invading nucleic acids into genomic CRISPR loci, which function as memory of previously encountered invaders. In Escherichia coli, transcripts of these loci are cleaved into small RNAs and utilized by the Cascade complex to bind invader DNA, which is then likely degraded by Cas3 during CRISPR interference.
We describe how a CRISPR-activated E. coli K12 is cured of a high copy number plasmid under non-selective conditions in a CRISPR-mediated way. Cured clones integrated between one and five anti-plasmid spacers in genomic CRISPR loci. New spacers are integrated directly downstream of the leader sequence. The spacers are non-randomly selected to target protospacers with an AAG protospacer adjacent motif (PAM), which is located directly upstream of the protospacer. A co-occurrence of PAM deviations and CRISPR repeat mutations was observed, indicating that one nucleotide from the PAM is incorporated as the last nucleotide of the repeat during integration of a new spacer. When multiple spacers were integrated in a single clone, all spacers targeted the same strand of the plasmid, implying that CRISPR interference caused by the first integrated spacer directs subsequent spacer acquisition events in a strand-specific manner.
The E. coli Type I-E CRISPR/Cas system provides resistance against bacteriophage infection, but also enables removal of residing plasmids. We established that there is a positive feedback loop between active spacers in a cluster – in our case the first acquired spacer - and spacers acquired thereafter, possibly through the use of specific DNA degradation products of the CRISPR interference machinery by the CRISPR adaptation machinery. This loop enables a rapid expansion of the spacer repertoire against an actively present DNA element that is already targeted, amplifying the CRISPR interference effect.
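The observation that new spacers target protospacers with a motif directly upstream suggests a simple check: locate each spacer in the target sequence and read off the bases immediately 5′ of the match. The sketch below does this on both strands of a linear sequence. It is an illustrative assumption about how such a tally could be made and not the analysis used in the study. The sequences and the function name `upstream_motifs` are hypothetical, and a circular plasmid would also require handling matches that span the origin.

```python
from collections import Counter

def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))

def upstream_motifs(target, spacer, motif_len=3):
    """For every spacer match on either strand, return (strand, bases 5' of the match)."""
    hits = []
    for strand, seq in (("+", target), ("-", reverse_complement(target))):
        start = seq.find(spacer)
        while start != -1:
            if start >= motif_len:  # ignore matches too close to the 5' end
                hits.append((strand, seq[start - motif_len:start]))
            start = seq.find(spacer, start + 1)
    return hits

# Hypothetical plasmid fragment and spacer, invented for illustration only.
spacer = "ACGGTCAATGCC"
plasmid = "TTGCAAG" + spacer + "GGATC"
hits = upstream_motifs(plasmid, spacer)
print(hits)                                   # [('+', 'AAG')]
print(Counter(motif for _, motif in hits))    # tally of motifs across matches
```

Strand and orientation conventions for reporting PAMs differ between studies, so the motif printed here should be compared with a published PAM only after confirming which strand that publication reports.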
CRISPR-Cas systems are RNA-based immune systems that protect prokaryotes from invaders such as phages and plasmids. In adaptation, the initial phase of the immune response, short foreign DNA fragments are captured and integrated into host CRISPR loci to provide heritable defense against encountered foreign nucleic acids. Each CRISPR contains a ∼100–500 bp leader element that typically includes a transcription promoter, followed by an array of captured ∼35 bp sequences (spacers) sandwiched between copies of an identical ∼35 bp direct repeat sequence. New spacers are added immediately downstream of the leader. Here, we have analyzed adaptation to phage infection in Streptococcus thermophilus at the CRISPR1 locus to identify cis-acting elements essential for the process. We show that the leader and a single repeat of the CRISPR locus are sufficient for adaptation in this system. Moreover, we identified a leader sequence element capable of stimulating adaptation at a dormant repeat. We found that sequences within 10 bp of the site of integration, in both the leader and repeat of the CRISPR, are required for the process. Our results indicate that information at the CRISPR leader-repeat junction is critical for adaptation in this Type II-A system and likely other CRISPR-Cas systems.
Clostridium difficile is the cause of the most frequently occurring nosocomial diarrhea worldwide. As an enteropathogen, C. difficile must be exposed to multiple exogenous genetic elements in bacteriophage-rich gut communities. CRISPR (clustered regularly interspaced short palindromic repeats)-Cas (CRISPR-associated) systems allow bacteria to adapt to foreign genetic invaders. Our recent data revealed active expression and processing of CRISPR RNAs from multiple type I-B CRISPR arrays in C. difficile reference strain 630. Here, we demonstrate active expression of CRISPR arrays in strain R20291, an epidemic C. difficile strain. Through genome sequencing and host range analysis of several new C. difficile phages and plasmid conjugation experiments, we provide evidence of a defensive function of the CRISPR-Cas system in both C. difficile strains. We further demonstrate that C. difficile Cas proteins are capable of interference in a heterologous host, Escherichia coli. These data set the stage for mechanistic and physiological analyses of CRISPR-Cas-mediated interactions of this important global human pathogen with its genetic parasites.
Clostridium difficile is the major cause of nosocomial infections associated with antibiotic therapy worldwide. To survive in bacteriophage-rich gut communities, enteropathogens must develop efficient systems for defense against foreign DNA elements. CRISPR-Cas systems have recently taken center stage among various anti-invader bacterial defense systems. We provide experimental evidence for the function of the C. difficile CRISPR system against plasmid DNA and bacteriophages. These data demonstrate the original features of the active C. difficile CRISPR system and bring important insights into the interactions of this major enteropathogen with foreign DNA invaders during its infection cycle.
CRISPR (clustered regularly interspaced short palindromic repeats) is a microbial immune system against foreign DNA. Recognition sequences (spacers) encoded within the CRISPR array mediate the immune reaction in a sequence-specific manner. The known mechanisms for the evolution of CRISPR arrays include spacer acquisition from foreign DNA elements at the time of invasion and array erosion through spacer deletion. Here, we consider the contribution of genetic recombination between homologous CRISPR arrays to the evolution of spacer repertoire. Acquisition of spacers from exogenic arrays via recombination may confer the recipient with immunity against unencountered antagonists. For this purpose, we develop a novel method for the detection of recombination in CRISPR arrays by modeling the spacer order in arrays from multiple strains from the same species. Because the evolutionary signal of spacer recombination may be similar to that of pervasive spacer deletions or independent spacer acquisition, our method entails a robustness analysis of the recombination inference by a statistical comparison to resampled and perturbed data sets. We analyze CRISPR data sets from four bacterial species: two Gammaproteobacteria species harboring CRISPR type I and two Streptococcus species harboring CRISPR type II loci. We find that CRISPR array evolution in Escherichia coli and Streptococcus agalactiae can be explained solely by vertical inheritance and differential spacer deletion. In Pseudomonas aeruginosa, we find an excess of single spacers potentially incorporated into the CRISPR locus during independent acquisition events. In Streptococcus thermophilus, evidence for spacer acquisition by recombination is present in 5 out of 70 strains. Genetic recombination has been proposed to accelerate adaptation by combining beneficial mutations that arose in independent lineages. 
However, for most species under study, we find that CRISPR evolution is shaped mainly by spacer acquisition and loss rather than recombination. Since the evolution of spacer content is characterized by a rapid turnover, it is likely that recombination is not beneficial for improving phage resistance in the strains under study, or that it cannot be detected in the resolution of intraspecies comparisons.
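The reasoning that spacer order carries the signal can be illustrated with a small check. Under vertical inheritance with spacer deletions only, spacers shared by two arrays should appear in the same relative order in both. The sketch below tests just that and is a simplified illustration, not the detection method developed in the study, which also includes a statistical robustness analysis. The spacer labels and the function name `shared_order_consistent` are hypothetical, and each spacer is assumed to occur at most once per array.

```python
def shared_order_consistent(array_a, array_b):
    """True if the spacers shared by both arrays occur in the same relative order."""
    shared = set(array_a) & set(array_b)
    order_a = [s for s in array_a if s in shared]
    order_b = [s for s in array_b if s in shared]
    return order_a == order_b

# Hypothetical arrays, listed from the leader end (newest spacer first).
strain_a = ["s5", "s4", "s2", "s1"]   # ancestral array minus s3
strain_b = ["s6", "s5", "s3", "s1"]   # gained s6, lost s4 and s2
strain_c = ["s2", "s5", "s1"]         # s2 sits ahead of s5, unlike strain_a

print(shared_order_consistent(strain_a, strain_b))  # True
print(shared_order_consistent(strain_a, strain_c))  # False
```

An order conflict such as the one between `strain_a` and `strain_c` only flags a pair for closer inspection. On its own it does not distinguish recombination from other explanations such as independent acquisition.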
evolutionary microbiology; lateral gene transfer; bacterial genomics
Well-studied innate immune systems exist throughout bacteria and archaea, but a more recently discovered genomic locus may offer prokaryotes surprising immunological adaptability. Mediated by a cassette-like genomic locus termed Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR), the microbial adaptive immune system differs from its eukaryotic immune analogues by incorporating new immunities unidirectionally. CRISPR thus stores genomically recoverable timelines of virus-host coevolution in natural organisms refractory to laboratory cultivation. Here we combined a population genetic mathematical model of CRISPR-virus coevolution with six years of metagenomic sequencing to link the recoverable genomic dynamics of CRISPR loci to the unknown population dynamics of virus and host in natural communities. Metagenomic reconstructions in an acid-mine drainage system document CRISPR loci conserving ancestral immune elements to the base-pair across thousands of microbial generations. This ‘trailer-end conservation’ occurs despite rapid viral mutation and despite rapid prokaryotic genomic deletion. The trailer-ends of many reconstructed CRISPR loci are also largely identical across a population. ‘Trailer-end clonality’ occurs despite predictions of host immunological diversity due to negative frequency dependent selection (kill the winner dynamics). Statistical clustering and model simulations explain this lack of diversity by capturing rapid selective sweeps by highly immune CRISPR lineages. Potentially explaining ‘trailer-end conservation,’ we record the first example of a viral bloom overwhelming a CRISPR system. The polyclonal viruses bloom even though they share sequences previously targeted by host CRISPR loci. Simulations show how increasing random genomic deletions in CRISPR loci purges immunological controls on long-lived viral sequences, allowing polyclonal viruses to bloom and depressing host fitness. 
Our results thus link documented patterns of genomic conservation in CRISPR loci to an evolutionary advantage against persistent viruses. By maintaining old immunities, selection may be tuning CRISPR-mediated immunity against viruses reemerging from lysogeny or migration.
Most microbes appear unculturable in the laboratory, limiting our knowledge of how virus and prokaryotic host evolve in natural systems. However, a genomic locus found in many prokaryotes, CRISPR, may offer cultivation-independent probes of virus-microbe coevolution. Utilizing nearby genes, CRISPR can serially incorporate short viral and plasmid sequences. These sequences bind and cleave cognate regions in subsequent viral and plasmid insertions, conferring adaptive anti-viral and anti-plasmid immunity. By incorporating sequences unidirectionally, CRISPR also provides timelines of virus-prokaryote coevolution. Yet, CRISPR only incorporates 30–80 base-pair viral sequences, leaving incomplete coevolutionary recordings. To reconstruct the missing coevolutionary dynamics shaping natural CRISPRs, we combined metagenomic reconstructions with population-scale mathematical modeling. Capturing rare and rapid sweeps of CRISPR diversity by highly immune lines, mathematical modeling explains why naturally reconstructed CRISPR loci are often largely identical across a population. Both model and experiment further document surprising proliferations of old viral sequences against which hosts had preexisting CRISPR immunity. Due to these deadly blooms of ancestral viral elements, CRISPR's conservation of old immune sequences appears to confer a selective advantage. This may explain the striking immunological memory documented in CRISPR loci, which occurs despite rapid viral mutation and despite rapid deletions in prokaryotic genomes.
Clustered regularly interspaced short palindromic repeats (CRISPR), in combination with CRISPR associated (cas) genes, constitute CRISPR-Cas bacterial adaptive immune systems. To generate immunity, these systems acquire short sequences of nucleic acids from foreign invaders and incorporate these into their CRISPR arrays as spacers. This adaptation process is the least characterized step in CRISPR-Cas immunity. Here, we used Pectobacterium atrosepticum to investigate adaptation in Type I-F CRISPR-Cas systems. Pre-existing spacers that matched plasmids stimulated hyperactive primed acquisition and resulted in the incorporation of up to nine new spacers across all three native CRISPR arrays. Endogenous expression of the cas genes was both necessary and sufficient for priming. The new spacers inhibited conjugation and transformation, and interference was enhanced with increasing numbers of new spacers. We analyzed ∼350 new spacers acquired in priming events and identified a 5′-protospacer-GG-3′ protospacer adjacent motif. In contrast to priming in Type I-E systems, new spacers matched either plasmid strand, and a biased distribution, including clustering near the primed protospacer, suggested a bi-directional translocation model for the Cas1:Cas2–3 adaptation machinery. Taken together, these results indicate that priming adaptation occurs in different CRISPR-Cas systems, that it can be highly active in wild-type strains and that the underlying mechanisms vary.
CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) loci provide prokaryotes with an adaptive immunity against viruses and other mobile genetic elements. CRISPR arrays can be transcribed and processed into small crRNA molecules, which are then used by the cell to target the foreign nucleic acid. Since spacers are accumulated by active CRISPR/Cas systems, the sequences of these spacers provide a record of the past "infection history" of the organism.
Here we analyzed all currently known spacers present in archaeal genomes and identified their source by DNA similarity. While nearly 50% of archaeal spacers matched mobile genetic elements, such as plasmids or viruses, several others matched chromosomal genes of other organisms, primarily other archaea. Thus, networks of gene exchange between archaeal species were revealed by the spacer analysis, including many cases of inter-genus and inter-species gene transfer events. Spacers that recognize viral sequences tend to be located further away from the leader sequence, implying that there exists a selective pressure for their retention.
CRISPR spacers provide direct evidence for extensive gene exchange in archaea, especially within genera, and support the current dogma where the primary role of the CRISPR/Cas system is anti-viral and anti-plasmid defense.
Open peer review
This article was reviewed by: Profs. W. Ford Doolittle, John van der Oost, Christa Schleper (nominated by board member Prof. J Peter Gogarten)
CRISPR; Lateral Gene transfer; Horizontal gene transfer; viruses; archaea; competence
The interaction of viruses and their prokaryotic hosts shaped the evolution of bacterial and archaeal life. Prokaryotes developed several strategies to evade viral attacks that include restriction modification, abortive infection and CRISPR/Cas systems. These adaptive immune systems found in many Bacteria and most Archaea consist of clustered regularly interspaced short palindromic repeat (CRISPR) sequences and a number of CRISPR associated (Cas) genes (Fig. 1) 1-3. Different sets of Cas proteins and repeats define at least three major divergent types of CRISPR/Cas systems 4. The universal proteins Cas1 and Cas2 are proposed to be involved in the uptake of viral DNA that will generate a new spacer element between two repeats at the 5' terminus of an extending CRISPR cluster 5. The entire cluster is transcribed into a precursor-crRNA containing all spacer and repeat sequences and is subsequently processed by an enzyme of the diverse Cas6 family into smaller crRNAs 6-8. These crRNAs consist of the spacer sequence flanked by a 5' terminal (8 nucleotides) and a 3' terminal tag derived from the repeat sequence 9. A repeated infection by the virus can now be blocked, as the new crRNA will be directed by a Cas protein complex (Cascade) to the viral DNA and identify it as such via base complementarity 10. Finally, for CRISPR/Cas type I systems, the nuclease Cas3 will destroy the detected invader DNA 11,12.
These processes define CRISPR/Cas as an adaptive immune system of prokaryotes and opened a fascinating research field for the study of the involved Cas proteins. The function of many Cas proteins is still elusive and the causes for the apparent diversity of the CRISPR/Cas systems remain to be illuminated. Potential activities of most Cas proteins were predicted via detailed computational analyses. A major fraction of Cas proteins are either shown or proposed to function as endonucleases 4.
Here, we present methods to generate crRNAs and precursor-crRNAs for the study of Cas endoribonucleases. Different endonuclease assays require either short repeat sequences that can directly be synthesized as RNA oligonucleotides or longer crRNA and pre-crRNA sequences that are generated via in vitro T7 RNA polymerase run-off transcription. This methodology allows the incorporation of radioactive nucleotides for the generation of internally labeled endonuclease substrates and the creation of synthetic or mutant crRNAs. Cas6 endonuclease activity is utilized to mature pre-crRNAs into crRNAs with 5'-hydroxyl and 2',3'-cyclic phosphate termini.
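The processing step described above, in which a pre-crRNA is cut within each repeat so that every crRNA carries an 8-nucleotide 5' tag and a 3' tag from the repeat, can be illustrated in silico. The following is a minimal sketch, not the protocol itself: the function name `process_pre_crrna` is hypothetical, repeats are assumed to be exact copies of one sequence, and the cut is assumed to fall exactly 8 nt upstream of the 3' end of each repeat.

```python
def process_pre_crrna(pre_crrna: str, repeat: str, tag_len: int = 8) -> list[str]:
    """Cut a pre-crRNA inside every repeat, tag_len nt upstream of the repeat's 3' end.

    Assumption: all repeats are exact, non-overlapping copies of `repeat`.
    Returns only fragments bounded by two cuts, i.e. 5' tag + spacer + 3' tag.
    """
    if not 0 < tag_len < len(repeat):
        raise ValueError("tag_len must be positive and shorter than the repeat")

    # Locate every repeat and record the cleavage position inside it.
    cut_sites = []
    start = pre_crrna.find(repeat)
    while start != -1:
        cut_sites.append(start + len(repeat) - tag_len)
        start = pre_crrna.find(repeat, start + len(repeat))

    # Each fragment between consecutive cuts is one mature crRNA.
    return [pre_crrna[a:b] for a, b in zip(cut_sites, cut_sites[1:])]
```

For a toy transcript built as repeat, spacer, repeat, spacer, repeat, the function returns two crRNAs, each consisting of the last 8 nt of the repeat, the spacer, and the remaining 5' portion of the next repeat. Real Cas6 enzymes differ in cleavage position and repeat requirements, so `tag_len` should be treated as a parameter rather than a constant.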
Molecular biology; Issue 67; CRISPR/Cas; endonuclease; in vitro transcription; crRNA; Cas6
Central to the disparate adaptive immune systems of archaea and bacteria are clustered regularly interspaced short palindromic repeats (CRISPR). The spacer regions derive from invading genetic elements and, via RNA intermediates and associated proteins, target and cleave nucleic acids of the invader. Here we demonstrate the hyperactive uptake of hundreds of unique spacers within CRISPR loci associated with type I and IIIB immune systems of a hyperthermophilic archaeon. Infection with an environmental virus mixture resulted in the exclusive uptake of protospacers from a co-infecting putative conjugative plasmid. Spacer uptake occurred by two distinct mechanisms in only one of two CRISPR loci subfamilies present. In two loci, insertions, often multiple, occurred adjacent to the leader while in a third locus single spacers were incorporated throughout the array. Protospacer DNAs were excised from the invading genetic element immediately after CCN motifs, on either strand, with the secondary cut apparently produced by a ruler mechanism. Over a 10-week period, there was a gradual decrease in the number of wild-type cells present in the culture but the virus and putative conjugative plasmid were still propagating. The results underline the complex dynamics of CRISPR-based immune systems within a population infected with genetic elements.
Clustered, regularly interspaced, short palindromic repeats (CRISPR) loci and their associated genes (cas) confer bacteria and archaea with adaptive immunity against phages and other invading genetic elements. A fundamental requirement of any immune system is the ability to build a memory of past infections in order to deal more efficiently with recurrent infections. The adaptive feature of CRISPR-Cas immune systems relies on their ability to memorize DNA sequences of invading molecules and integrate them in between the repetitive sequences of the CRISPR array in the form of “spacers”. The transcription of a spacer generates a small antisense RNA that is used by RNA-guided Cas nucleases to cleave the invading nucleic acid in order to protect the cell from infection. The acquisition of new spacers allows the CRISPR-Cas immune system to rapidly adapt against new threats and is therefore termed “adaptation”. Recent studies have begun to elucidate the genetic requirements for adaptation and have demonstrated that rather than being a stochastic process, the selection of new spacers is influenced by several factors. We review here our current knowledge of the CRISPR adaptation mechanism.
CRISPR/Cas systems; spacer acquisition; adaptation; bacteriophage; adaptive immunity; horizontal gene transfer
Clustered regularly interspaced short palindromic repeats (CRISPRs) are a family of DNA direct repeats found in many prokaryotic genomes. Repeats of 21–37 bp typically show weak dyad symmetry and are separated by regularly sized, nonrepetitive spacer sequences. Four CRISPR-associated (Cas) protein families, designated Cas1 to Cas4, are strictly associated with CRISPR elements and always occur near a repeat cluster. Some spacers originate from mobile genetic elements and are thought to confer “immunity” against the elements that harbor these sequences. In the present study, we have systematically investigated uncharacterized proteins encoded in the vicinity of these CRISPRs and found many additional protein families that are strictly associated with CRISPR loci across multiple prokaryotic species. Multiple sequence alignments and hidden Markov models have been built for 45 Cas protein families. These models identify family members with high sensitivity and selectivity and classify key regulators of development, DevR and DevS, in Myxococcus xanthus as Cas proteins. These identifications show that CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a repeat cluster or filling the region between two repeat clusters. Distinctive subsets of the collection of Cas proteins recur in phylogenetically distant species and correlate with characteristic repeat periodicity. The analyses presented here support initial proposals of mobility of these units, along with the likelihood that loci of different subtypes interact with one another as well as with host cell defensive, replicative, and regulatory systems. It is evident from this analysis that CRISPR/cas loci are larger, more complex, and more heterogeneous than previously appreciated.
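The repeat-spacer architecture described above lends itself to a simple illustration. The sketch below is not the method used in the study, which relied on multiple sequence alignments and hidden Markov models for the Cas proteins; it only shows how spacers could be pulled out of an array when the repeat sequence is already known and perfectly conserved. The function name `extract_spacers` is hypothetical.

```python
def extract_spacers(array: str, repeat: str) -> list[str]:
    """Return the sequences lying between consecutive exact copies of `repeat`.

    Assumptions: the repeat is known in advance and every copy is identical;
    degenerate repeats, common in real arrays, are not handled.
    """
    if not repeat:
        raise ValueError("repeat must be a non-empty sequence")
    parts = array.split(repeat)
    # parts[0] and parts[-1] are the flanking sequences outside the array,
    # so only the inner pieces are spacers.
    return parts[1:-1]


def spacers_are_regular(spacers: list[str], tolerance: int = 2) -> bool:
    """Check that spacer lengths differ by at most `tolerance` bp (an arbitrary cutoff)."""
    if not spacers:
        return False
    lengths = [len(s) for s in spacers]
    return max(lengths) - min(lengths) <= tolerance
```

An array with fewer than two repeat copies yields an empty list, since no sequence is then bounded by repeats on both sides.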
The family of clustered regularly interspaced short palindromic repeats (CRISPRs) describes a class of DNA repeats found in nearly half of all bacterial and archaeal genomes. These DNA repeat regions have a remarkably regular structure: unique sequences of constant size, called spacers, sit between each pair of repeats. The DNA repeats do not encode proteins, but appear to be transcribed and processed into small RNAs that may have any number of functions, including resistance to any phage (i.e., virus of bacteria) whose sequence matches a spacer; spacers change rapidly as microbial strains evolve. This work describes 41 new CRISPR-associated (cas) gene families, which are always found near these repeats, in addition to the four previously known. It shows that CRISPR systems belong to different classes, with different repeat patterns, sets of genes, and species ranges. Most of these seem to come and go rather rapidly from their host genomes. These possibly beneficial mobile genetic elements may play an important role in driving prokaryotic evolution.
The clustered regularly interspaced short palindromic repeats (CRISPR) and their associated (Cas) proteins form adaptive immune systems in bacteria to combat phage and other foreign genetic elements. Typically, short spacer sequences are acquired from the invader DNA and incorporated into CRISPR arrays in the bacterial genome. Small RNAs are generated that contain these spacer sequences and enable sequence-specific destruction of the foreign nucleic acids. Occasionally, spacers are acquired from the chromosome, which instead leads to targeting of the host genome. Chromosomal targeting is highly toxic to the bacterium, providing a strong selective pressure for a variety of evolutionary routes that enable host cell survival. Mutations that inactivate the CRISPR-Cas functionality, such as within the cas genes, CRISPR repeat, protospacer adjacent motifs (PAM), and target sequence, mediate escape from toxicity. This self-targeting might provide some explanation for the incomplete distribution of CRISPR-Cas systems in less than half of sequenced bacterial genomes. More importantly, self-genome targeting can cause large-scale genomic alterations, including remodeling or deletion of pathogenicity islands and other non-mobile chromosomal regions. While control of horizontal gene transfer is perceived as their main function, our recent work illuminates an alternative role of CRISPR-Cas systems in causing host genomic changes and influencing bacterial evolution.
CRISPR; Cas; chromosomal targeting; bacterial evolution; genomic islands; plasmids; horizontal gene transfer; bacteriophages; integrative and conjugative elements
Prokaryotes thrive in spite of the vast number and diversity of their viruses. This partly results from the evolution of mechanisms to inactivate or silence the action of exogenous DNA. Among these, Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are unique in providing adaptive immunity against elements with high local resemblance to genomes of previously infecting agents. Here, we analyze the CRISPR loci of 51 complete genomes of Escherichia and Salmonella. CRISPR are in two pairs of loci in Escherichia and a single pair in Salmonella, each pair showing a similar turnover rate, repeat sequence and putative linkage to a common set of cas genes. Yet, phylogeny shows that CRISPR and associated cas genes have different evolutionary histories, the latter being frequently exchanged or lost. In our set, one CRISPR pair seems specialized in plasmids, often matching genes coding for the replication, conjugation and antirestriction machinery. Strikingly, this pair also matches the cognate cas genes, in which case these genes are absent. The unexpectedly high conservation of this anti-CRISPR suggests selection to counteract the invasion of mobile elements containing functional CRISPR/cas systems. There are few spacers in most CRISPR, which rarely match genomes of known phages. Furthermore, we found that strains that diverged less than 250 thousand years ago show virtually identical CRISPR. The lack of congruence between cas, CRISPR and the species phylogeny and the slow pace of CRISPR change make CRISPR poor epidemiological markers in enterobacteria. All these observations are at odds with the expectedly abundant and dynamic repertoire of spacers in an immune system aiming at protecting bacteria from phages. Since we observe purifying selection for the maintenance of CRISPR, these results suggest that alternative evolutionary roles for CRISPR remain to be uncovered.
The CRISPR-Cas (clustered regularly interspaced short palindromic repeats/CRISPR-associated genes) system provides prokaryotic cells with an adaptive and heritable immune response to foreign genetic elements, such as viruses, plasmids, and transposons. It is present in the majority of Archaea and almost half of species of Bacteria. Porphyromonas gingivalis is an important human pathogen that has been proven to be an etiological agent of periodontitis and has been linked to systemic conditions, such as rheumatoid arthritis and cardiovascular disease. At least 95% of clinical strains of P. gingivalis carry CRISPR arrays, suggesting that these arrays play an important function in vivo. Here we show that all four CRISPR arrays present in the P. gingivalis W83 genome are transcribed. For one of the arrays, we demonstrate in vivo activity against double-stranded DNA constructs containing protospacer sequences accompanied at the 3′ end by an NGG protospacer-adjacent motif (PAM). Most of the 44 spacers present in the genome of P. gingivalis W83 share no significant similarity with any known sequences, although 4 spacers are similar to sequences from bacteria found in the oral cavity and the gastrointestinal tract. Four spacers match genomic sequences of the host; however, none of these is flanked at its 3′ terminus by the appropriate PAM element.
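The PAM requirement stated above, a protospacer accompanied at its 3′ end by NGG, can be sketched as a simple sequence scan. This is an illustration only: the function name `find_protospacers` is hypothetical, the default protospacer length of 30 nt is an arbitrary assumption rather than a value from the study, and only the strand given as input is searched.

```python
import re


def find_protospacers(dna: str, spacer_len: int = 30) -> list[tuple[int, str, str]]:
    """Return (start, protospacer, pam) for each site followed at its 3' end by NGG.

    Assumptions: `spacer_len` is a placeholder length; only the supplied strand
    is scanned; positions containing ambiguous bases are skipped.
    """
    if spacer_len <= 0:
        raise ValueError("spacer_len must be positive")
    dna = dna.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % spacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]
```

The same scan, run on a spacer's genomic match plus its downstream flank, would show whether a self-matching spacer is followed by the appropriate PAM. To cover both strands, the function can be called a second time on the reverse complement of the input.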
IMPORTANCE The CRISPR-Cas (clustered regularly interspaced short palindromic repeats/CRISPR-associated genes) system is a unique system that provides prokaryotic cells with an adaptive and heritable immunity. In this report, we show that the CRISPR-Cas system of P. gingivalis, an important human pathogen associated with periodontitis and possibly also other conditions, such as rheumatoid arthritis and cardiovascular disease, is active and provides protection from foreign genetic elements. Importantly, the data presented here may be useful for better understanding the communication between cells in larger bacterial communities and, consequently, the process of disease development and progression.
Background: CRISPR/Cas systems allow archaea and bacteria to resist invasion by foreign nucleic acids.
Results: The CRISPR/Cas system in Haloferax recognized six different PAM sequences that could trigger a defense response.
Conclusion: The PAM sequence specificity of the defense response in type I CRISPR systems is more relaxed than previously thought.
Significance: The PAM sequence requirements for interference and adaptation appear to differ markedly.
The clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated (Cas) system provides adaptive and heritable immunity against foreign genetic elements in most archaea and many bacteria. Although this system is widespread and diverse with many subtypes, only a few species have been investigated to elucidate the precise mechanisms of defense against viruses or plasmids. Approximately 90% of all sequenced archaea encode CRISPR/Cas systems, but their molecular details have so far only been examined in three archaeal species: Sulfolobus solfataricus, Sulfolobus islandicus, and Pyrococcus furiosus. Here, we analyzed the CRISPR/Cas system of Haloferax volcanii using a plasmid-based invader assay. Haloferax encodes a type I-B CRISPR/Cas system with eight Cas proteins and three CRISPR loci for which the identity of protospacer adjacent motifs (PAMs) was unknown until now. We identified six different PAM sequences that are required upstream of the protospacer to permit target DNA recognition. This is only the second archaeon for which PAM sequences have been determined, and the first CRISPR group with such a high number of PAM sequences. Cells could survive the plasmid challenge if their CRISPR/Cas system was altered or defective, e.g. by deletion of the cas gene cassette. Experimental PAM data were supplemented with bioinformatics data on Haloferax and Haloquadratum.
Archaea; Microbiology; RNA; RNA Metabolism; RNA Processing; CRISPR/Cas; Haloferax volcanii; PAM
Clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated genes (cas) are widely distributed among bacteria. These systems provide adaptive immunity against mobile genetic elements specified by the spacer sequences stored within the CRISPR.
The CRISPR-Cas system has been identified using Basic Local Alignment Search Tool (BLAST) against other sequenced and annotated genomes and confirmed via CRISPRfinder program. Using Polymerase Chain Reactions (PCR) and Sanger DNA sequencing, we discovered CRISPRs in additional bacterial isolates of the same species of Bordetella. Transcriptional activity and processing of the CRISPR have been assessed via RT-PCR.
Here we describe a novel Type II-C CRISPR and its associated genes—cas1, cas2, and cas9—in several isolates of a newly discovered Bordetella species. The CRISPR-cas locus, which is absent in all other Bordetella species, has a significantly lower GC-content than the genome-wide average, suggesting acquisition of this locus via horizontal gene transfer from a currently unknown source. The CRISPR array is transcribed and processed into mature CRISPR RNAs (crRNA), some of which have homology to prophages found in closely related species B. hinzii.
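The GC-content comparison mentioned above, in which a locus with a markedly lower GC fraction than the genome-wide average hints at horizontal acquisition, reduces to a short calculation. The sketch below is an illustration under stated assumptions: the function names are hypothetical, ambiguous bases such as N are ignored, and no test of statistical significance is performed.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of `seq`."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return (seq.count("G") + seq.count("C")) / counted


def gc_difference(locus: str, genome: str) -> float:
    """GC fraction of the locus minus that of the genome; negative means GC-poor locus."""
    return gc_content(locus) - gc_content(genome)
```

A negative value from `gc_difference` indicates that the locus is GC-poor relative to the genome. Deciding whether the difference is meaningful would require comparing it against the spread of GC content across similarly sized windows of the same genome.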
Expression of the CRISPR-Cas system and processing of crRNAs with perfect homology to prophages present in closely related species, but absent in that containing this CRISPR-Cas system, suggest it provides protection against phage predation. The 3,117-bp cas9 endonuclease gene from this novel CRISPR-Cas system is 990 bp smaller than that of Streptococcus pyogenes, the 4,107-bp allele currently used for genome editing, which may make it a useful tool in various CRISPR-Cas technologies.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-2028-9) contains supplementary material, which is available to authorized users.
Bordetella pseudohinzii; Type II CRISPR; Cas9; SpyCas9; Bacteria; Genome editing; Protospacer; GC-content; HGT
The CRISPR-Cas (Clustered Regularly Interspaced Short Palindromic Repeats – CRISPR associated proteins) system provides adaptive immunity in archaea and bacteria. A hallmark of CRISPR-Cas is the involvement of short crRNAs that guide associated proteins in the destruction of invading DNA or RNA. We present three fundamentally distinct processing pathways in the cyanobacterium Synechocystis sp. PCC6803 for a subtype I-D system (CRISPR1) and two type III systems (CRISPR2 and CRISPR3), which are located together on the plasmid pSYSA. Using high-throughput transcriptome analyses and assays of transcript accumulation, we found all CRISPR loci to be highly expressed, but the individual crRNAs had profoundly varying abundances despite single transcription start sites for each array. In a computational analysis, CRISPR3 spacers with stable secondary structures displayed a greater ratio of degradation products. These structures might interfere with the loading of the crRNAs into RNP complexes, explaining the varying abundances. The maturation of CRISPR1 and CRISPR2 transcripts depends on at least two different Cas6 proteins. Mutation of gene sll7090, encoding a Cmr2 protein, led to the disappearance of all CRISPR3-derived crRNAs, providing in vivo evidence for a function of Cmr2 in the maturation, regulation of expression, Cmr complex formation or stabilization of CRISPR3 transcripts. Finally, we optimized CRISPR repeat structure prediction, and the results indicate that the spacer context can influence individual repeat structures.