Bacteria and archaea face continual onslaughts of rapidly diversifying viruses and plasmids. Many prokaryotes maintain adaptive immune systems known as clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated genes (Cas). CRISPR-Cas systems are genomic sensors that serially acquire viral and plasmid DNA fragments (spacers) that are utilized to target and cleave matching viral and plasmid DNA in subsequent genomic invasions, offering critical immunological memory. Only 50% of sequenced bacteria possess CRISPR-Cas immunity, in contrast to over 90% of sequenced archaea. To probe why half of bacteria lack CRISPR-Cas immunity, we combined comparative genomics and mathematical modeling. Analysis of hundreds of diverse prokaryotic genomes shows that CRISPR-Cas systems are substantially more prevalent in thermophiles than in mesophiles. With sequenced bacteria disproportionately mesophilic and sequenced archaea mostly thermophilic, the presence of CRISPR-Cas appears to depend more on environmental temperature than on bacterial-archaeal taxonomy. Mutation rates are typically severalfold higher in mesophilic prokaryotes than in thermophilic prokaryotes. To quantitatively test whether accelerated viral mutation leads microbes to lose CRISPR-Cas systems, we developed a stochastic model of virus-CRISPR coevolution. The model competes CRISPR-Cas-positive (CRISPR-Cas+) prokaryotes against CRISPR-Cas-negative (CRISPR-Cas−) prokaryotes, continually weighing the antiviral benefits conferred by CRISPR-Cas immunity against its fitness costs. Tracking this cost-benefit analysis across parameter space reveals viral mutation rate thresholds beyond which CRISPR-Cas cannot provide sufficient immunity and is purged from host populations. These results offer a simple, testable viral diversity hypothesis to explain why mesophilic bacteria disproportionately lack CRISPR-Cas immunity. 
More generally, fundamental limits on the adaptability of biological sensors (Lamarckian evolution) are predicted.
A remarkable recent discovery in microbiology is that bacteria and archaea possess systems conferring immunological memory and adaptive immunity. Clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated genes (CRISPR-Cas) are genomic sensors that allow prokaryotes to acquire DNA fragments from invading viruses and plasmids. Providing immunological memory, these stored fragments destroy matching DNA in future viral and plasmid invasions. CRISPR-Cas systems also provide adaptive immunity, keeping up with mutating viruses and plasmids by continually acquiring new DNA fragments. Surprisingly, less than 50% of mesophilic bacteria, in contrast to almost 90% of thermophilic bacteria and Archaea, maintain CRISPR-Cas immunity. Using mathematical modeling, we probe this dichotomy, showing how increased viral mutation rates can explain the reduced prevalence of CRISPR-Cas systems in mesophiles. Rapidly mutating viruses outrun CRISPR-Cas immune systems, likely decreasing their prevalence in bacterial populations. Thus, viral adaptability may select against, rather than for, immune adaptability in prokaryotes.
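The cost-benefit argument summarized above can be illustrated with a toy calculation. The sketch below is not the stochastic coevolution model described in the abstract; it is a deliberately simplified, deterministic stand-in. Every name and parameter value in it (`infection_cost`, `crispr_cost`, `spacer_len`, `lag_generations`) is a hypothetical assumption chosen only to show how a mutation-rate threshold can arise when a fixed fitness cost is weighed against an immunity benefit that decays as viruses mutate.

```python
# Toy illustration only: all parameters are hypothetical placeholders,
# not values taken from the study.

def escape_fraction(mu, spacer_len=30, lag_generations=10):
    """Chance that a protospacer carries at least one mutation.

    mu is an assumed per-base, per-generation viral mutation rate.
    A single mutation anywhere in the protospacer is treated as escape.
    """
    return 1.0 - (1.0 - mu) ** (spacer_len * lag_generations)


def crispr_net_benefit(mu, infection_cost=0.5, crispr_cost=0.05):
    """Fitness gain of CRISPR-Cas+ hosts over CRISPR-Cas- hosts in this toy."""
    protected = 1.0 - escape_fraction(mu)
    return infection_cost * protected - crispr_cost


def threshold_mutation_rate(infection_cost=0.5, crispr_cost=0.05,
                            spacer_len=30, lag_generations=10):
    """Mutation rate at which the net benefit crosses zero (closed form)."""
    ratio = crispr_cost / infection_cost
    return 1.0 - ratio ** (1.0 / (spacer_len * lag_generations))


if __name__ == "__main__":
    mu_star = threshold_mutation_rate()
    for mu in (0.0, mu_star / 2, mu_star, 2 * mu_star):
        print(f"mu={mu:.5f}  net benefit={crispr_net_benefit(mu):+.4f}")
```

In this toy, the net benefit is positive below the threshold and negative above it, which mirrors the qualitative claim that sufficiently fast viral mutation can make CRISPR-Cas a net cost.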
CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) loci, together with cas (CRISPR–associated) genes, form the CRISPR/Cas adaptive immune system, a primary defense strategy that eubacteria and archaea mobilize against foreign nucleic acids, including phages and conjugative plasmids. Short spacer sequences separated by the repeats are derived from foreign DNA and direct interference to future infections. The availability of hundreds of shotgun metagenomic datasets from the Human Microbiome Project (HMP) enables us to explore the distribution and diversity of known CRISPRs in human-associated microbial communities and to discover new CRISPRs. We propose a targeted assembly strategy to reconstruct CRISPR arrays, which whole-metagenome assemblies fail to identify. For each known CRISPR type (identified from reference genomes), we use its direct repeat consensus sequence to recruit reads from each HMP dataset and then assemble the recruited reads into CRISPR loci; the unique spacer sequences can then be extracted for analysis. We also identified novel CRISPRs or new CRISPR variants in contigs from whole-metagenome assemblies and used targeted assembly to more comprehensively identify these CRISPRs across samples. We observed that the distributions of CRISPRs (including 64 known and 86 novel ones) are largely body-site specific. We provide detailed analysis of several CRISPR loci, including novel CRISPRs. For example, known streptococcal CRISPRs were identified in most oral microbiomes, totaling ∼8,000 unique spacers: samples resampled from the same individual and oral site shared the most spacers; different oral sites from the same individual shared significantly fewer, while different individuals had almost no common spacers, indicating the impact of subtle niche differences on the evolution of CRISPR defenses. We further demonstrate potential applications of CRISPRs to the tracing of rare species and the virus exposure of individuals. 
This work indicates the importance of effective identification and characterization of CRISPR loci to the study of the dynamic ecology of microbiomes.
Human bodies are complex ecological systems in which various microbial organisms and viruses interact with each other and with the human host. The Human Microbiome Project (HMP) has resulted in >700 datasets of shotgun metagenomic sequences, from which we can learn about the compositions and functions of human-associated microbial communities. CRISPR/Cas systems are a widespread class of adaptive immune systems in bacteria and archaea, providing acquired immunity against foreign nucleic acids: CRISPR/Cas defense pathways involve integration of viral- or plasmid-derived DNA segments into CRISPR arrays (forming spacers between repeated structural sequences), and expression of short crRNAs from these single repeat-spacer units, to generate interference to future invading foreign genomes. Powered by an effective computational approach (the targeted assembly approach for CRISPR), our analysis of CRISPR arrays in the HMP datasets provides the very first global view of bacterial immunity systems in human-associated microbial communities. The great diversity of CRISPR spacers we observed among different body sites, in different individuals, and in single individuals over time, indicates the impact of subtle niche differences on the evolution of CRISPR defenses and indicates the key role of bacteriophage (and plasmids) in shaping human microbial communities.
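The read-recruitment and spacer-extraction steps described above can be sketched in a few lines. This is a minimal illustration under strong assumptions, not the authors' targeted assembly pipeline: it uses exact string matching on one strand only, skips the assembly step entirely, and the function names (`recruit_reads`, `extract_spacers`, `unique_spacers`, `shared_spacers`) and the toy sequences are hypothetical.

```python
# Minimal sketch: exact, single-strand matching; no assembly, no
# reverse-complement handling, no tolerance for sequencing errors.

def recruit_reads(reads, repeat):
    """Keep only the reads that contain the repeat consensus sequence."""
    return [read for read in reads if repeat in read]


def extract_spacers(read, repeat):
    """Return the segments flanked by the repeat on both sides."""
    pieces = read.split(repeat)
    # The first and last pieces lie outside the array, so drop them.
    return [piece for piece in pieces[1:-1] if piece]


def unique_spacers(reads, repeat):
    """Collect the set of distinct spacers seen in the recruited reads."""
    spacers = set()
    for read in recruit_reads(reads, repeat):
        spacers.update(extract_spacers(read, repeat))
    return spacers


def shared_spacers(reads_a, reads_b, repeat):
    """Spacers found in both samples for the same repeat type."""
    return unique_spacers(reads_a, repeat) & unique_spacers(reads_b, repeat)


if __name__ == "__main__":
    repeat = "GTTTTAGAGC"  # made-up repeat used only for this example
    sample_a = ["AA" + repeat + "ACGTACGTAA" + repeat + "TTGGCCAATT" + repeat + "CC"]
    sample_b = [repeat + "TTGGCCAATT" + repeat]
    print(sorted(unique_spacers(sample_a, repeat)))
    print(sorted(shared_spacers(sample_a, sample_b, repeat)))
```

Comparing spacer sets by intersection is one simple way to express the sharing analysis across samples, sites, and individuals; a real analysis would also need to cluster near-identical spacers.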
Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR), together with associated genes (cas), form the CRISPR–cas adaptive immune system, which can provide resistance to viruses and plasmids in bacteria and archaea. Here, we use mathematical models, population dynamic experiments, and DNA sequence analyses to investigate the host–phage interactions in a model CRISPR–cas system, Streptococcus thermophilus DGCC7710 and its virulent phage 2972. At the molecular level, the bacteriophage-immune mutant bacteria (BIMs) and CRISPR–escape mutant phage (CEMs) obtained in this study are consistent with those anticipated from an iterative model of this adaptive immune system: resistance by the addition of novel spacers and phage evasion of resistance by mutation in matching sequences or flanking motifs. While CRISPR BIMs were readily isolated and CEMs generated at high rates (frequencies in excess of 10⁻⁶), our population studies indicate that there is more to the dynamics of phage–host interactions and the establishment of a BIM–CEM arms race than predicted from existing assumptions about phage infection and CRISPR–cas immunity. Among the unanticipated observations are: (i) the invasion of phage into populations of BIMs resistant by the acquisition of one (but not two) spacers, (ii) the survival of sensitive bacteria despite the presence of high densities of phage, and (iii) the maintenance of phage-limited communities due to the failure of even two-spacer BIMs to become established in populations with wild-type bacteria and phage. We attribute (i) to incomplete resistance of single-spacer BIMs. Based on the results of additional modeling and experiments, we postulate that (ii) and (iii) can be attributed to the phage infection-associated production of enzymes or other compounds that induce phenotypic phage resistance in sensitive bacteria and kill resistant BIMs.
We present evidence in support of these hypotheses and discuss the implications of these results for the ecology and (co)evolution of bacteria and phage.
The evidence that the CRISPR regions of the genomes of archaea and bacteria play a role in the ecology and (co)evolution of these microbes and their viruses is overwhelming: (i) the spacers (variable sequences of 26–72 bp of DNA between the repeats of this region) of these prokaryotes are homologous to the DNA of viruses in their communities; (ii) experimentally, the acquisition and incorporation of spacers of viral DNA can protect these organisms from subsequent infection by these viruses; (iii) experimentally, viruses evade this immunity by mutation in homologous protospacers or protospacer-adjacent motifs (PAMs). Not so clear are the nature and magnitude of the role CRISPR plays in this ecology and evolution. Here, we use mathematical models, experiments with Streptococcus thermophilus and the phage 2972, and DNA sequence analyses to explore the contribution of CRISPR–cas immunity to the ecology and (co)evolution of bacteria and their viruses. The results of this study suggest that the contribution of CRISPR to the ecology of bacteria and phage is more modest and limited, and the conditions for a CRISPR–mediated coevolutionary arms race between these organisms more restrictive, than anticipated from models based on the canonical view of phage infection and CRISPR–cas immunity.
Clustered regularly interspaced short palindromic repeats (CRISPR) are hypervariable loci widely distributed in prokaryotes that provide acquired immunity against foreign genetic elements. Here, we characterize a novel Streptococcus thermophilus locus, CRISPR3, and experimentally demonstrate its ability to integrate novel spacers in response to bacteriophage. Also, we analyze CRISPR diversity and activity across three distinct CRISPR loci in several S. thermophilus strains. We show that both CRISPR repeats and cas genes are locus specific and functionally coupled. A total of 124 strains were studied, and 109 unique spacer arrangements were observed across the three CRISPR loci. Overall, 3,626 spacers were analyzed, including 2,829 for CRISPR1 (782 unique), 173 for CRISPR2 (16 unique), and 624 for CRISPR3 (154 unique). Sequence analysis of the spacers revealed homology and identity to phage sequences (77%), plasmid sequences (16%), and S. thermophilus chromosomal sequences (7%). Polymorphisms were observed for the CRISPR repeats, CRISPR spacers, cas genes, CRISPR motif, locus architecture, and specific sequence content. Interestingly, CRISPR loci evolved both via polarized addition of novel spacers after exposure to foreign genetic elements and via internal deletion of spacers. We hypothesize that the level of diversity is correlated with relative CRISPR activity and propose that the activity is highest for CRISPR1, followed by CRISPR3, while CRISPR2 may be degenerate. Globally, the dynamic nature of CRISPR loci might prove valuable for typing and comparative analyses of strains and microbial populations. Also, CRISPRs provide critical insights into the relationships between prokaryotes and their environments, notably the coevolution of host and viral genomes.
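The spacer counts reported above can be checked with a short calculation. The sketch below uses only the numbers given in the abstract; treating the unique-to-total ratio as a crude diversity proxy is an assumption made here for illustration and is not necessarily the measure the authors used.

```python
# Spacer counts as reported in the abstract above.
spacer_counts = {
    "CRISPR1": {"total": 2829, "unique": 782},
    "CRISPR2": {"total": 173, "unique": 16},
    "CRISPR3": {"total": 624, "unique": 154},
}

# The per-locus totals should add up to the overall number of spacers analyzed.
total_spacers = sum(c["total"] for c in spacer_counts.values())

# Assumed proxy: the share of spacers at each locus that are unique.
unique_fraction = {
    locus: c["unique"] / c["total"] for locus, c in spacer_counts.items()
}

# Rank loci from most to least diverse under this proxy.
ranking = sorted(unique_fraction, key=unique_fraction.get, reverse=True)

if __name__ == "__main__":
    print("total spacers:", total_spacers)
    for locus in ranking:
        print(f"{locus}: {unique_fraction[locus]:.3f}")
```

Under this proxy the per-locus totals sum to the reported 3,626 spacers, and the ordering of loci matches the proposed activity ranking (CRISPR1, then CRISPR3, then CRISPR2).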
The CRISPR-Cas systems of adaptive antivirus immunity are present in most archaea and many bacteria, and provide resistance to specific viruses or plasmids by inserting fragments of foreign DNA into the host genome and then utilizing transcripts of these spacers to inactivate the cognate foreign genome. The recent development of powerful genome engineering tools on the basis of CRISPR-Cas has sharply increased the interest in the diversity and evolution of these systems. Comparative genomic data indicate that during evolution of prokaryotes CRISPR-Cas loci are lost and acquired via horizontal gene transfer at high rates. Mathematical modeling and initial experimental studies of CRISPR-carrying microbes and viruses reveal complex coevolutionary dynamics.
We performed a bifurcation analysis of models of coevolution of viruses and microbial host that possess CRISPR-Cas hereditary adaptive immunity systems. The analyzed Malthusian and logistic models display complex, and in particular, quasi-chaotic oscillation regimes that have not been previously observed experimentally or in agent-based models of the CRISPR-mediated immunity. The key factors for the appearance of the quasi-chaotic oscillations are the non-linear dependence of the host immunity on the virus load and the partitioning of the hosts into the immune and susceptible populations, so that the system consists of three components.
Bifurcation analysis of CRISPR-host coevolution model predicts complex regimes including quasi-chaotic oscillations. The quasi-chaotic regimes of virus-host coevolution are likely to be biologically relevant given the evolutionary instability of the CRISPR-Cas loci revealed by comparative genomics. The results of this analysis might have implications beyond the CRISPR-Cas systems, i.e. could describe the behavior of any adaptive immunity system with a heritable component, be it genetic or epigenetic. These predictions are experimentally testable.
This manuscript was reviewed by Sandor Pongor, Sergei Maslov and Marek Kimmel.
Clustered regularly interspaced short palindromic repeats (CRISPR) confer sequence-dependent, adaptive resistance in prokaryotes against viruses and plasmids via incorporation of short sequences, called spacers, derived from foreign genetic elements. CRISPR loci are thus considered to provide records of past infections. To describe the host-parasite (i.e., cyanophages and plasmids) interactions involving the bloom-forming freshwater cyanobacterium Microcystis aeruginosa, we investigated CRISPR in four M. aeruginosa strains and in two previously sequenced genomes. The number of spacers in each locus was larger than the average among prokaryotes. All spacers were strain specific, except for a string of 11 spacers shared in two closely related strains, suggesting diversification of the loci. Using CRISPR repeat-based PCR, 24 CRISPR genotypes were identified in a natural cyanobacterial community. Among 995 unique spacers obtained, only 10 sequences showed similarity to M. aeruginosa phage Ma-LMM01. Of these, six spacers showed only silent or conservative nucleotide mutations compared to Ma-LMM01 sequences, suggesting a strategy by the cyanophage to avert CRISPR immunity dependent on nucleotide identity. These results imply that host-phage interactions can be divided into M. aeruginosa-cyanophage combinations rather than pandemics of population-wide infectious cyanophages. Spacer similarity also showed frequent exposure of M. aeruginosa to small cryptic plasmids that were observed only in a few strains. Thus, the diversification of CRISPR implies that M. aeruginosa has been challenged by diverse communities (almost entirely uncharacterized) of cyanophages and plasmids.
CRISPR arrays and associated cas genes are widespread in bacteria and archaea and confer acquired resistance to viruses. To examine viral immunity in the context of naturally evolving microbial populations, we analyzed genomic data from two thermophilic Synechococcus isolates (Syn OS-A and Syn OS-B′) as well as a prokaryotic metagenome and viral metagenome derived from microbial mats in hot springs at Yellowstone National Park. Two distinct CRISPR types, distinguished by the repeat sequence, are found in both the Syn OS-A and Syn OS-B′ genomes. The genome of Syn OS-A contains a third CRISPR type with a distinct repeat sequence, which is not found in Syn OS-B′ but appears to be shared with other microorganisms that inhabit the mat. The CRISPR repeats identified in the microbial metagenome are highly conserved, while the spacer sequences (hereafter referred to as “viritopes” to emphasize their critical role in viral immunity) were mostly unique and had no high-identity matches when searched against GenBank. Searching the viritopes against the viral metagenome, however, yielded several high-similarity matches, some of which were within a gene identified as a likely viral lysozyme/lysin protein. Analysis of viral metagenome sequences corresponding to this lysozyme/lysin protein revealed several mutations, all of which translate into silent or conservative changes that are unlikely to affect protein function but may help the virus evade the host CRISPR resistance mechanism. These results demonstrate the varied challenges presented by a natural virus population, and support the notion that the CRISPR/viritope system must be able to adapt quickly to provide host immunity. The ability of metagenomics to track population-level variation in viritope sequences allows for a culture-independent method for evaluating the fast co-evolution of host and viral genomes and its consequence on the structuring of complex microbial communities.
Discriminating self and non-self is a universal requirement of immune systems. Adaptive immune systems in prokaryotes are centered around repetitive loci called CRISPRs (clustered regularly interspaced short palindromic repeats), into which invader DNA fragments are incorporated. CRISPR transcripts are processed into small RNAs that guide CRISPR-associated (Cas) proteins to invading nucleic acids by complementary base pairing. However, to avoid autoimmunity, it is essential that these RNA-guides exclusively target invading DNA and not complementary DNA sequences (i.e., self-sequences) located in the host's own CRISPR locus. Previous work on the Type III-A CRISPR system from Staphylococcus epidermidis has demonstrated that a portion of the CRISPR RNA-guide sequence is involved in self versus non-self discrimination. This self-avoidance mechanism relies on sensing base pairing between the RNA-guide and sequences flanking the target DNA. To determine if the RNA-guide participates in self versus non-self discrimination in the Type I-E system from Escherichia coli, we altered base pairing potential between the RNA-guide and the flanks of DNA targets. Here we demonstrate that Type I-E systems discriminate self from non-self through a base pairing-independent mechanism that strictly relies on the recognition of four unchangeable PAM sequences. In addition, this work reveals that the first base pair between the guide RNA and the PAM nucleotide immediately flanking the target sequence can be disrupted without affecting the interference phenotype. Remarkably, this indicates that base pairing at this position is not involved in foreign DNA recognition. Results in this paper reveal that the Type I-E mechanism of avoiding self sequences and preventing autoimmunity is fundamentally different from that employed by Type III-A systems. We propose the exclusive targeting of PAM-flanked sequences to be termed a target versus non-target discrimination mechanism.
CRISPR loci and their associated genes form a diverse set of adaptive immune systems that are widespread among prokaryotes. In these systems, the CRISPR-associated genes (cas) encode proteins that capture fragments of invading DNA and integrate these sequences between repeat sequences of the host's CRISPR locus. This information is used upon re-infection to degrade invader genomes. Storing invader sequences in host genomes necessitates a mechanism to differentiate between invader sequences on invader genomes and invader sequences on the host genome. CRISPR-Cas of Staphylococcus epidermidis (Type III-A system) is inhibited when invader sequences are flanked by repeat sequences, and this prevents targeting of the CRISPR locus on the host genome. Here we demonstrate that Escherichia coli CRISPR-Cas (Type I-E system) is not inhibited by repeat sequences. Instead, this system is specifically activated by the presence of bona fide Protospacer Adjacent Motifs (PAMs) in the target. PAMs are conserved sequences adjoining invader sequences on the invader genome, and these sequences are never adjacent to invader sequences within host CRISPR loci. PAM recognition is not affected by base pairing potential of the target with the crRNA. As such, the Type I-E system lacks the ability to specifically recognize self DNA.
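The target versus non-target logic described above, in which interference requires a matching sequence flanked by a permitted PAM, can be sketched as a simple string check. This is a conceptual illustration only: the function name `has_pam_flanked_match`, the choice to look for the PAM immediately upstream of the match, the three-nucleotide PAM length, and every sequence in the example (including the `allowed_pams` set) are assumptions made for the sketch rather than values taken from the study.

```python
# Conceptual sketch: exact matching on one strand, PAM assumed to sit
# directly upstream of the protospacer. All sequences are placeholders.

def has_pam_flanked_match(dna, spacer, allowed_pams, pam_len=3):
    """Return True if the spacer sequence occurs with an allowed PAM beside it."""
    start = dna.find(spacer)
    while start != -1:
        # A match counts as a target only when a full PAM precedes it.
        if start >= pam_len and dna[start - pam_len:start] in allowed_pams:
            return True
        start = dna.find(spacer, start + 1)
    return False


if __name__ == "__main__":
    allowed_pams = {"AAG", "ATG"}      # illustrative placeholder set
    spacer = "ACGTTGCAACGTTGCA"        # made-up spacer
    repeat = "GTTCACTGCCG"             # made-up repeat ending in a non-PAM

    invader = "CCCAAG" + spacer + "GGG"        # spacer match next to a PAM
    host_array = repeat + spacer + repeat      # same sequence inside the array
    escape_mutant = "CCCACG" + spacer + "GGG"  # PAM altered by one base

    print(has_pam_flanked_match(invader, spacer, allowed_pams))
    print(has_pam_flanked_match(host_array, spacer, allowed_pams))
    print(has_pam_flanked_match(escape_mutant, spacer, allowed_pams))
```

The host array is left alone here not because it is recognized as self, but because the repeat next to the spacer is not a permitted PAM, which is the distinction the abstract draws between target versus non-target and self versus non-self discrimination.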
CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) is a prokaryotic adaptive defence system that provides resistance against alien replicons such as viruses and plasmids. Spacers in a CRISPR cassette confer immunity against viruses and plasmids containing regions complementary to the spacers and hence they retain a footprint of interactions between prokaryotes and their viruses in individual strains and ecosystems. The human gut is a rich habitat populated by numerous microorganisms, but a large fraction of these are unculturable and little is known about them in general and their CRISPR systems in particular.
We used human gut metagenomic data from three open projects in order to characterize the composition and dynamics of CRISPR cassettes in the human-associated microbiota. Applying available CRISPR-identification algorithms and a previously designed filtering procedure to the assembled human gut metagenomic contigs, we found 388 CRISPR cassettes, 373 of which had repeats not observed previously in complete genomes or other datasets. Only 171 of 3,545 identified spacers were coupled with protospacers from the human gut metagenomic contigs. The number of matches to GenBank sequences was negligible, providing protospacers for 26 spacers.
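Coupling spacers with protospacers, as described above, amounts to searching each spacer against the contigs on both strands. The following is a minimal sketch under simplifying assumptions (exact matches only, no mismatch tolerance, no exclusion of the contig that carries the cassette itself); the function names are hypothetical and the sequences are toy data, not the procedure or data used in the study.

```python
# Minimal sketch: exact matching of spacers to contigs on either strand.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string over A, C, G, T."""
    return seq.translate(_COMPLEMENT)[::-1]


def spacers_with_protospacers(spacers, contigs):
    """Return the spacers that occur in any contig on either strand."""
    matched = set()
    for spacer in spacers:
        rc = reverse_complement(spacer)
        if any(spacer in contig or rc in contig for contig in contigs):
            matched.add(spacer)
    return matched


def matched_fraction(spacers, contigs):
    """Share of distinct spacers that have at least one protospacer match."""
    distinct = set(spacers)
    if not distinct:
        return 0.0
    return len(spacers_with_protospacers(distinct, contigs)) / len(distinct)


if __name__ == "__main__":
    spacers = ["AACCGGTA", "TTTTGGGG", "CACACACA"]  # toy spacers
    contigs = ["GGAACCGGTACC", "AACCCCAAAATT"]      # toy contigs
    print(sorted(spacers_with_protospacers(spacers, contigs)))
    print(matched_fraction(spacers, contigs))
```

For scale, the figures reported above (171 of 3,545 spacers coupled with protospacers) correspond to a matched fraction of roughly 4.8%. In a real analysis the contig carrying the CRISPR cassette would have to be masked, since every spacer trivially matches its own array.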
Reconstruction of CRISPR cassettes allowed us to track the dynamics of spacer content. In agreement with other published observations, we show that spacers shared by different cassettes (and hence likely older ones) tend to lie toward the trailer ends, whereas spacers with matches in the metagenomes are distributed unevenly across cassettes, demonstrating a preference to form clusters closer to the active end of a CRISPR cassette, adjacent to the leader, and hence suggesting dynamic interactions between prokaryotes and viruses in the human gut. Remarkably, spacers match protospacers in the metagenome of the same individual with a frequency comparable to that of a random control, but may match protospacers from metagenomes of other individuals.
The analysis of assembled contigs is complementary to the approach based on the analysis of original reads and hence provides additional data about composition and evolution of CRISPR cassettes, revealing the dynamics of CRISPR-phage interactions in metagenomes.
CRISPR; Human gut; Microbiome
The CRISPR-Cas adaptive immunity systems that are present in most Archaea and many Bacteria function by incorporating fragments of alien genomes into specific genomic loci, transcribing the inserts and using the transcripts as guide RNAs to destroy the genome of the cognate virus or plasmid. This RNA interference-like immune response is mediated by numerous, diverse and rapidly evolving Cas (CRISPR-associated) proteins, several of which form the Cascade complex involved in the processing of CRISPR transcripts and cleavage of the target DNA. Comparative analysis of the Cas protein sequences and structures led to the classification of the CRISPR-Cas systems into three Types (I, II and III).
A detailed comparison of the available sequences and structures of Cas proteins revealed several unnoticed homologous relationships. The Repeat-Associated Mysterious Proteins (RAMPs) containing a distinct form of the RNA Recognition Motif (RRM) domain, which are major components of the CRISPR-Cas systems, were classified into three large groups, Cas5, Cas6 and Cas7. Each of these groups includes many previously uncharacterized proteins now shown to adopt the RAMP structure. Evidence is presented that large subunits contained in most of the CRISPR-Cas systems could be homologous to Cas10 proteins which contain a polymerase-like Palm domain and are predicted to be enzymatically active in Type III CRISPR-Cas systems but inactivated in Type I systems. These findings, the fact that the CRISPR polymerases, RAMPs and Cas2 all contain core RRM domains, and distinct gene arrangements in the three types of CRISPR-Cas systems together provide for a simple scenario for origin and evolution of the CRISPR-Cas machinery. Under this scenario, the CRISPR-Cas system originated in thermophilic Archaea and subsequently spread horizontally among prokaryotes.
Because of the extreme diversity of CRISPR-Cas systems, in-depth sequence and structure comparisons continue to reveal unexpected homologous relationships among Cas proteins. Unification of Cas protein families previously considered unrelated improves the classification of CRISPR-Cas systems and enables a reconstruction of their evolution.
Open peer review
This article was reviewed by Malcolm White (nominated by Purificacion Lopez-Garcia), Frank Eisenhaber and Igor Zhulin.
Clustered, regularly interspaced short palindromic repeats (CRISPR) provide bacteria and archaea with sequence-specific, acquired defense against plasmids and phage. Because mobile elements constitute up to 25% of the genome of multidrug-resistant (MDR) enterococci, it was of interest to examine the codistribution of CRISPR and acquired antibiotic resistance in enterococcal lineages. A database was built from 16 Enterococcus faecalis draft genome sequences to identify commonalities and polymorphisms in the location and content of CRISPR loci. With this data set, we were able to detect identities between CRISPR spacers and sequences from mobile elements, including pheromone-responsive plasmids and phage, suggesting that CRISPR regulates the flux of these elements through the E. faecalis species. Based on conserved locations of CRISPR and CRISPR-cas loci and the discovery of a new CRISPR locus with associated functional genes, CRISPR3-cas, we screened additional E. faecalis strains for CRISPR content, including isolates predating the use of antibiotics. We found a highly significant inverse correlation between the presence of a CRISPR-cas locus and acquired antibiotic resistance in E. faecalis, and examination of an additional eight E. faecium genomes yielded similar results for that species. A mechanism for CRISPR-cas loss in E. faecalis was identified. The inverse relationship between CRISPR-cas and antibiotic resistance suggests that antibiotic use inadvertently selects for enterococcal strains with compromised genome defense.
For many bacteria, including the opportunistically pathogenic enterococci, antibiotic resistance is mediated by acquisition of new DNA and is frequently encoded on mobile DNA elements such as plasmids and transposons. Certain enterococcal lineages have recently emerged that are characterized by abundant mobile DNA, including numerous viruses (phage), and plasmids and transposons encoding multiple antibiotic resistances. These lineages cause hospital infection outbreaks around the world. The striking influx of mobile DNA into these lineages is in contrast to what would be expected if a self (genome)-defense system was present. Clustered, regularly interspaced short palindromic repeat (CRISPR) defense is a recently discovered mechanism of prokaryotic self-defense that provides a type of acquired immunity. Here, we find that antibiotic resistance and possession of complete CRISPR loci are inversely related and that members of recently emerged high-risk enterococcal lineages lack complete CRISPR loci. Our results suggest that antibiotic therapy inadvertently selects for enterococci with compromised genome defense.
A stochastic, agent-based mathematical model of the coevolution of the archaeal and bacterial adaptive immunity system, CRISPR-Cas, and lytic viruses shows that CRISPR-Cas immunity can stabilize the virus-host coexistence rather than leading to the extinction of the virus. In the model, CRISPR-Cas immunity does not specifically promote viral diversity, presumably because the selection pressure on each single proto-spacer is too weak. However, the overall virus diversity in the presence of CRISPR-Cas grows due to the increase of the host and, accordingly, the virus population size. Above a threshold value of total viral diversity, which is proportional to the viral mutation rate and population size, the CRISPR-Cas system becomes ineffective and is lost due to the associated fitness cost. Our previous modeling study has suggested that the ubiquity of CRISPR-Cas in hyperthermophiles, which contrasts with its comparatively low prevalence in mesophiles, is due to lower rates of mutation fixation in thermal habitats. The present findings offer a complementary, simpler perspective on this contrast through the larger population sizes of mesophiles compared to hyperthermophiles, because of which CRISPR-Cas can become ineffective in mesophiles. The efficacy of CRISPR-Cas sharply increases with the number of proto-spacers per viral genome, potentially explaining the low information content of the proto-spacer-associated motif (PAM) that is required for spacer acquisition by CRISPR-Cas, because a higher specificity would restrict the number of spacers available to CRISPR-Cas, thus hampering immunity. The very existence of the PAM might reflect the tradeoff between the requirement of diverse spacers for efficient immunity and avoidance of autoimmunity.
Prokaryotes thrive in spite of the vast number and diversity of their viruses. This partly results from the evolution of mechanisms to inactivate or silence the action of exogenous DNA. Among these, Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are unique in providing adaptive immunity against elements with high local resemblance to genomes of previously infecting agents. Here, we analyze the CRISPR loci of 51 complete genomes of Escherichia and Salmonella. CRISPR occur as two pairs of loci in Escherichia and a single pair in Salmonella, each pair showing a similar turnover rate, repeat sequence and putative linkage to a common set of cas genes. Yet, phylogeny shows that CRISPR and associated cas genes have different evolutionary histories, the latter being frequently exchanged or lost. In our set, one CRISPR pair seems specialized in plasmids, often matching genes coding for the replication, conjugation and antirestriction machinery. Strikingly, this pair also matches the cognate cas genes, in which case these genes are absent. The unexpectedly high conservation of this anti-CRISPR suggests selection to counteract the invasion of mobile elements containing functional CRISPR/cas systems. There are few spacers in most CRISPR, and these rarely match genomes of known phages. Furthermore, we found that strains that diverged less than 250 thousand years ago show virtually identical CRISPR. The lack of congruence between cas, CRISPR and the species phylogeny and the slow pace of CRISPR change make CRISPR poor epidemiological markers in enterobacteria. All these observations are at odds with the abundant and dynamic repertoire of spacers expected of an immune system that protects bacteria from phages. Since we observe purifying selection for the maintenance of CRISPR, these results suggest that alternative evolutionary roles for CRISPR remain to be uncovered.
In prokaryotes, clustered regularly interspaced short palindromic repeats (CRISPRs) and their associated (Cas) proteins constitute a defence system against bacteriophages and plasmids. CRISPR/Cas systems acquire short spacer sequences from foreign genetic elements and incorporate these into their CRISPR arrays, generating a memory of past invaders. Defence is provided by short non-coding RNAs that guide Cas proteins to cleave complementary nucleic acids. While most spacers are acquired from phages and plasmids, there are examples of spacers that match genes elsewhere in the host bacterial chromosome. In Pectobacterium atrosepticum the type I-F CRISPR/Cas system has acquired a self-complementary spacer that perfectly matches a protospacer target in a horizontally acquired island (HAI2) involved in plant pathogenicity. Given the paucity of experimental data about CRISPR/Cas–mediated chromosomal targeting, we examined this process by developing a tightly controlled system. Chromosomal targeting was highly toxic via targeting of DNA and resulted in growth inhibition and cellular filamentation. The toxic phenotype was avoided by mutations in the cas operon, the CRISPR repeats, the protospacer target, and protospacer-adjacent motif (PAM) beside the target. Indeed, the natural self-targeting spacer was non-toxic due to a single nucleotide mutation adjacent to the target in the PAM sequence. Furthermore, we show that chromosomal targeting can result in large-scale genomic alterations, including the remodelling or deletion of entire pre-existing pathogenicity islands. These features can be engineered for the targeted deletion of large regions of bacterial chromosomes. In conclusion, in DNA–targeting CRISPR/Cas systems, chromosomal interference is deleterious by causing DNA damage and providing a strong selective pressure for genome alterations, which may have consequences for bacterial evolution and pathogenicity.
Bacteria have evolved mechanisms that provide protection from continual invasion by viruses and other foreign elements. Resistance systems, known as CRISPR/Cas, were recently discovered and equip bacteria and archaea with an “adaptive immune system.” This adaptive immunity provides a highly evolvable sequence-specific small RNA–based memory of past invasions by viruses and foreign genetic elements. There are many cases where these systems appear to target regions within the bacterial host's own genome (a possible autoimmunity), but the evolutionary rationale for this is unclear. Here, we demonstrate that CRISPR/Cas targeting of the host chromosome is highly toxic but that cells survive through mutations that alleviate the immune mechanism. We have used this phenotype to gain insight into how these systems function and show that large changes in the bacterial genome can occur. For example, targeting of a chromosomal pathogenicity island, important for virulence of the potato pathogen Pectobacterium atrosepticum, resulted in deletion of the island, which constituted ∼2% of the bacterial genome. These results have broad significance for the role of CRISPR/Cas systems and their impact on the evolution of bacterial genomes and virulence. In addition, this study demonstrates their potential as a tool for the targeted deletion of specific regions of bacterial chromosomes.
CRISPR-Cas systems are one of the most widespread phage resistance mechanisms in prokaryotes. Our lab recently identified the first examples of phage-borne anti-CRISPR genes that encode protein inhibitors of the type I-F CRISPR-Cas system of Pseudomonas aeruginosa. A key question arising from this work was whether there are other types of anti-CRISPR genes. In the current work, we address this question by demonstrating that some of the same phages carrying type I-F anti-CRISPR genes also possess genes that mediate inhibition of the type I-E CRISPR-Cas system of P. aeruginosa. We have discovered four distinct families of these type I-E anti-CRISPR genes. These genes do not inhibit the type I-F CRISPR-Cas system of P. aeruginosa or the type I-E system of Escherichia coli. Type I-E and I-F anti-CRISPR genes are located at the same position in the genomes of a large group of related P. aeruginosa phages, yet they are found in a variety of combinations and arrangements. We have also identified functional anti-CRISPR genes within nonprophage Pseudomonas genomic regions that are likely mobile genetic elements. This work emphasizes the potential importance of anti-CRISPR genes in phage evolution and lateral gene transfer and supports the hypothesis that more undiscovered families of anti-CRISPR genes exist. Finally, we provide the first demonstration that the type I-E CRISPR-Cas system of P. aeruginosa is naturally active without genetic manipulation, which contrasts with E. coli and other previously characterized I-E systems.
The CRISPR-Cas system is an adaptive immune system possessed by the majority of prokaryotic organisms to combat potentially harmful foreign genetic elements. This study reports the discovery of bacteriophage-encoded anti-CRISPR genes that mediate inhibition of a well-studied subtype of CRISPR-Cas system. The four families of anti-CRISPR genes described here, which comprise only the second group of anti-CRISPR genes to be identified, encode small proteins that bear no sequence similarity to previously studied phage or bacterial proteins. Anti-CRISPR genes represent a newly discovered and intriguing facet of the ongoing evolutionary competition between phages and their bacterial hosts.
In bacteria and archaea, viruses are the primary infectious agents, acting as virulent, often deadly pathogens. A form of adaptive immune defense known as CRISPR-Cas enables microbial cells to acquire immunity to viral pathogens by recognizing specific sequences encoded in viral genomes. The unique biology of this system results in evolutionary dynamics of host and viral diversity that cannot be fully explained by the traditional models used to describe microbe-virus coevolutionary dynamics. Here, we show how the CRISPR-mediated adaptive immune response of hosts to invading viruses facilitates the emergence of an evolutionary mode we call distributed immunity - the coexistence of multiple, equally-fit immune alleles among individuals in a microbial population. We use an eco-evolutionary modeling framework to quantify distributed immunity and demonstrate how it emerges and fluctuates in multi-strain communities of hosts and viruses as a consequence of CRISPR-induced coevolution under conditions of low viral mutation and high relative numbers of viral protospacers. We demonstrate that distributed immunity promotes sustained diversity and stability in host communities and decreased viral population density that can lead to viral extinction. We analyze sequence diversity of experimentally coevolving populations of Streptococcus thermophilus and their viruses where CRISPR-Cas is active, and find the rapid emergence of distributed immunity in the host population, demonstrating the importance of this emergent phenomenon in evolving microbial communities.
CRISPR/Cas, bacterial and archaeal systems of interference with foreign genetic elements such as viruses or plasmids, consist of DNA loci called CRISPR cassettes (a set of variable spacers regularly separated by palindromic repeats) and associated cas genes. When a CRISPR spacer sequence exactly matches a sequence in a viral genome, the cell can become resistant to the virus. The CRISPR/Cas systems function through small RNAs originating from longer CRISPR cassette transcripts. While laboratory strains of Escherichia coli contain a functional CRISPR/Cas system (as judged by the appearance of phage resistance under conditions of artificial co-overexpression of cas genes and a CRISPR cassette engineered to target a λ phage), no natural phage resistance due to CRISPR system function has been observed in this best-studied organism, and no E. coli CRISPR spacer matches sequences of well-studied E. coli phages. To better understand the apparently “silent” E. coli CRISPR/Cas system, we systematically characterized processed transcripts from CRISPR cassettes. Using an engineered strain with a genomically located spacer matching phage λ, we show that endogenous levels of CRISPR cassette and cas gene expression allow only weak protection against infection with the phage. However, derepression of the CRISPR/Cas system by disruption of the hns gene leads to a high level of protection.
CRISPR/Cas is a widespread adaptive immune system in prokaryotes. This system integrates short stretches of DNA derived from invading nucleic acids into genomic CRISPR loci, which function as memory of previously encountered invaders. In Escherichia coli, transcripts of these loci are cleaved into small RNAs and utilized by the Cascade complex to bind invader DNA, which is then likely degraded by Cas3 during CRISPR interference.
We describe how a CRISPR-activated E. coli K12 is cured of a high-copy-number plasmid under non-selective conditions in a CRISPR-mediated manner. Cured clones integrated between one and five anti-plasmid spacers in genomic CRISPR loci. New spacers are integrated directly downstream of the leader sequence. The spacers are non-randomly selected to target protospacers with an AAG protospacer adjacent motif (PAM), which is located directly upstream of the protospacer. A co-occurrence of PAM deviations and CRISPR repeat mutations was observed, indicating that one nucleotide from the PAM is incorporated as the last nucleotide of the repeat during integration of a new spacer. When multiple spacers were integrated in a single clone, all spacers targeted the same strand of the plasmid, implying that CRISPR interference caused by the first integrated spacer directs subsequent spacer acquisition events in a strand-specific manner.
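The spacer selection rule described above can be illustrated with a short Python sketch. It scans both strands of a DNA sequence for an AAG motif and reports the bases immediately downstream as candidate protospacers, so that the PAM sits directly upstream of each candidate. The function names, the default 32-nt spacer length and the coordinate convention are assumptions made for illustration only; this is not the analysis pipeline used in the study.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def find_protospacers(seq: str, pam: str = "AAG", spacer_len: int = 32):
    """List candidate protospacers that have `pam` directly upstream.

    Returns tuples (strand, start, protospacer). `start` is the 0-based
    index of the first protospacer base on the scanned strand; for the
    "-" strand it is relative to the reverse complement (an assumed
    convention, chosen to keep the sketch short).
    """
    hits = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        pam_pos = s.find(pam)
        while pam_pos != -1:
            start = pam_pos + len(pam)
            proto = s[start:start + spacer_len]
            if len(proto) == spacer_len:  # skip PAMs too close to the end
                hits.append((strand, start, proto))
            pam_pos = s.find(pam, pam_pos + 1)  # allow overlapping PAMs
    return hits


if __name__ == "__main__":
    toy_plasmid = "CCAAGACGTACGTCC"  # hypothetical sequence, not real data
    for hit in find_protospacers(toy_plasmid, spacer_len=8):
        print(hit)
```

Grouping the returned hits by their strand label gives a simple way to check whether a set of acquired spacers all target the same strand, as reported for clones with multiple new spacers.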
The E. coli Type I-E CRISPR/Cas system provides resistance against bacteriophage infection, but also enables removal of resident plasmids. We established that there is a positive feedback loop between active spacers in a cluster (in our case the first acquired spacer) and spacers acquired thereafter, possibly through the use of specific DNA degradation products of the CRISPR interference machinery by the CRISPR adaptation machinery. This loop enables a rapid expansion of the spacer repertoire against an actively present DNA element that is already targeted, amplifying the CRISPR interference effect.
Clustered regularly interspaced short palindromic repeats (CRISPR) constitute a bacterial and archaeal adaptive immune system that protects against bacteriophage (phage). Analysis of CRISPR loci reveals the history of phage infections and provides a direct link between phage and their hosts. All current tools for CRISPR identification have been developed to analyse completed genomes and are not well suited to the analysis of metagenomic data sets, where CRISPR loci are difficult to assemble owing to their repetitive structure and population heterogeneity. Here, we introduce a new algorithm, Crass, which is designed to identify and reconstruct CRISPR loci from raw metagenomic data without the need for assembly or prior knowledge of CRISPR in the data set. CRISPR in assembled data are often fragmented across many contigs/scaffolds and do not fully represent the population heterogeneity of CRISPR loci. Crass identified substantially more CRISPR in metagenomes previously analysed using assembly-based approaches. Using Crass, we were able to detect CRISPR that contained spacers with sequence homology to phage in the system, which would not have been identified using other approaches. The increased sensitivity, specificity and speed of Crass will facilitate comprehensive analysis of CRISPRs in metagenomic data sets, increasing our understanding of phage-host interactions and co-evolution within microbial communities.
The interaction of viruses and their prokaryotic hosts shaped the evolution of bacterial and archaeal life. Prokaryotes developed several strategies to evade viral attacks that include restriction modification, abortive infection and CRISPR/Cas systems. These adaptive immune systems found in many Bacteria and most Archaea consist of clustered regularly interspaced short palindromic repeat (CRISPR) sequences and a number of CRISPR associated (Cas) genes (Fig. 1) [1-3]. Different sets of Cas proteins and repeats define at least three major divergent types of CRISPR/Cas systems [4]. The universal proteins Cas1 and Cas2 are proposed to be involved in the uptake of viral DNA that will generate a new spacer element between two repeats at the 5' terminus of an extending CRISPR cluster [5]. The entire cluster is transcribed into a precursor-crRNA containing all spacer and repeat sequences and is subsequently processed by an enzyme of the diverse Cas6 family into smaller crRNAs [6-8]. These crRNAs consist of the spacer sequence flanked by a 5' terminal (8 nucleotides) and a 3' terminal tag derived from the repeat sequence [9]. A repeated infection of the virus can now be blocked as the new crRNA will be directed by a Cas protein complex (Cascade) to the viral DNA and identify it as such via base complementarity [10]. Finally, for CRISPR/Cas type I systems, the nuclease Cas3 will destroy the detected invader DNA [11,12].
These processes define CRISPR/Cas as an adaptive immune system of prokaryotes and opened a fascinating research field for the study of the involved Cas proteins. The function of many Cas proteins is still elusive and the causes for the apparent diversity of the CRISPR/Cas systems remain to be illuminated. Potential activities of most Cas proteins were predicted via detailed computational analyses. A major fraction of Cas proteins are either shown or proposed to function as endonucleases [4].
Here, we present methods to generate crRNAs and precursor-crRNAs for the study of Cas endoribonucleases. Different endonuclease assays require either short repeat sequences that can directly be synthesized as RNA oligonucleotides or longer crRNA and pre-crRNA sequences that are generated via in vitro T7 RNA polymerase run-off transcription. This methodology allows the incorporation of radioactive nucleotides for the generation of internally labeled endonuclease substrates and the creation of synthetic or mutant crRNAs. Cas6 endonuclease activity is utilized to mature pre-crRNAs into crRNAs with 5'-hydroxyl and 2',3'-cyclic phosphate termini.
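As a rough illustration of the maturation step, the following Python sketch models a pre-crRNA as a string of alternating repeats and spacers and cuts it once inside every repeat, 8 nucleotides upstream of the repeat's 3' end. Each fragment between two consecutive cuts then carries an 8-nt 5' tag, the spacer, and a 3' tag derived from the next repeat, as described above. The function name, the toy sequences and the assumption that no further trimming occurs are illustrative only; the sketch says nothing about the chemistry of the termini.

```python
def process_pre_crrna(pre_crrna: str, repeat: str, tag_len: int = 8):
    """Split a pre-crRNA into mature crRNAs (simplified model).

    A cut is placed inside every exact copy of `repeat`, `tag_len`
    nucleotides before the repeat's 3' end. Fragments between two
    consecutive cuts are returned; the 5' and 3' leftovers of the
    transcript are discarded because they contain no complete spacer.
    """
    cut_offset = len(repeat) - tag_len
    cuts = []
    pos = pre_crrna.find(repeat)
    while pos != -1:
        cuts.append(pos + cut_offset)
        # Continue after this repeat; repeats are assumed not to overlap.
        pos = pre_crrna.find(repeat, pos + len(repeat))
    return [pre_crrna[a:b] for a, b in zip(cuts, cuts[1:])]


if __name__ == "__main__":
    # Hypothetical 16-nt repeat and 4-nt spacers, chosen for readability.
    repeat = "GCGCGCGCAUAUAUAU"
    pre = repeat + "CCCC" + repeat + "UUGG" + repeat
    for crrna in process_pre_crrna(pre, repeat):
        print(crrna)
```

Real repeats are longer and the 3' tag can differ between systems, so the sketch is best read as a picture of where the repeat-derived tags come from rather than as a model of any particular Cas6 enzyme.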
Molecular biology; Issue 67; CRISPR/Cas; endonuclease; in vitro transcription; crRNA; Cas6
Clustered regularly interspaced short palindromic repeats (CRISPRs) form a recently characterized type of prokaryotic antiphage defense system. The phage-host interactions involving CRISPRs have been studied in experiments with selected bacterial or archaeal species and, computationally, in completely sequenced genomes. However, these studies do not allow one to take prokaryotic population diversity and phage-host interaction dynamics into account. This gap can be filled by using metagenomic data: in particular, the largest existing data set, generated from the Sorcerer II Global Ocean Sampling expedition. The application of three publicly available CRISPR recognition programs to the Global Ocean metagenome produced a large proportion of false-positive results. To address this problem, a filtering procedure was designed. It resulted in about 200 reliable CRISPR cassettes, which were then studied in detail. The repeat consensuses were clustered into several stable classes that differed from the existing classification. Short fragments of DNA similar to the cassette spacers were more frequently present in the same geographical location than in other locations (P < 0.0001). We developed a catalogue of elementary CRISPR-forming events and reconstructed the likely evolutionary history of cassettes that had common spacers. Metagenomic collections allow for relatively unbiased analysis of phage-host interactions and CRISPR evolution. The results of this study demonstrate that CRISPR cassettes retain the memory of the local virus population at a particular ocean location. CRISPR evolution may be described using a limited vocabulary of elementary events that have a natural biological interpretation.
Clustered regularly interspaced short palindromic repeats (CRISPRs) are a family of DNA direct repeats found in many prokaryotic genomes. Repeats of 21–37 bp typically show weak dyad symmetry and are separated by regularly sized, nonrepetitive spacer sequences. Four CRISPR-associated (Cas) protein families, designated Cas1 to Cas4, are strictly associated with CRISPR elements and always occur near a repeat cluster. Some spacers originate from mobile genetic elements and are thought to confer “immunity” against the elements that harbor these sequences. In the present study, we have systematically investigated uncharacterized proteins encoded in the vicinity of these CRISPRs and found many additional protein families that are strictly associated with CRISPR loci across multiple prokaryotic species. Multiple sequence alignments and hidden Markov models have been built for 45 Cas protein families. These models identify family members with high sensitivity and selectivity and classify key regulators of development, DevR and DevS, in Myxococcus xanthus as Cas proteins. These identifications show that CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a repeat cluster or filling the region between two repeat clusters. Distinctive subsets of the collection of Cas proteins recur in phylogenetically distant species and correlate with characteristic repeat periodicity. The analyses presented here support initial proposals of mobility of these units, along with the likelihood that loci of different subtypes interact with one another as well as with host cell defensive, replicative, and regulatory systems. It is evident from this analysis that CRISPR/cas loci are larger, more complex, and more heterogeneous than previously appreciated.
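The repeat-spacer architecture described above (direct repeats separated by regularly sized spacers) lends itself to a naive computational screen, sketched here in Python. The code indexes every k-mer in a sequence and keeps those that recur at least three times with every gap between consecutive copies inside an allowed window. The function name, the default k of 21 (the lower end of the repeat range quoted above) and the gap window are assumptions chosen for illustration; this is not the hidden Markov model approach used in the study, nor the algorithm of any published CRISPR finder.

```python
from collections import defaultdict


def find_candidate_arrays(genome: str, k: int = 21, min_copies: int = 3,
                          min_gap: int = 20, max_gap: int = 60):
    """Return (kmer, positions) for k-mers that look like CRISPR repeats.

    A k-mer qualifies when it occurs at least `min_copies` times and the
    sequence between each pair of consecutive copies (the putative
    spacer) is between `min_gap` and `max_gap` bases long.
    """
    positions = defaultdict(list)
    for i in range(len(genome) - k + 1):
        positions[genome[i:i + k]].append(i)

    candidates = []
    for kmer, pos in positions.items():
        if len(pos) < min_copies:
            continue
        gaps = [b - (a + k) for a, b in zip(pos, pos[1:])]
        if all(min_gap <= g <= max_gap for g in gaps):
            candidates.append((kmer, pos))
    return candidates


if __name__ == "__main__":
    # Toy locus: an 8-nt repeat separated by 5-nt spacers (hypothetical).
    repeat = "ACGTTGCA"
    locus = "TT" + repeat + "AAAAA" + repeat + "CCCCC" + repeat + "GG"
    print(find_candidate_arrays(locus, k=8, min_gap=4, max_gap=6))
```

When a true repeat is longer than k, every k-mer window inside it is reported separately, so a practical screen would merge overlapping hits and tolerate mismatches between repeat copies.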
The family of clustered regularly interspaced short palindromic repeats (CRISPRs) describes a class of DNA repeats found in nearly half of all bacterial and archaeal genomes. These DNA repeat regions have a remarkably regular structure: unique sequences of constant size, called spacers, sit between each pair of repeats. The DNA repeats do not encode proteins, but appear to be transcribed and processed into small RNAs that may have any number of functions, including resistance to any phage (i.e., virus of bacteria) whose sequence matches a spacer; spacers change rapidly as microbial strains evolve. This work describes 41 new CRISPR-associated (cas) gene families, which are always found near these repeats, in addition to the four previously known. It shows that CRISPR systems belong to different classes, with different repeat patterns, sets of genes, and species ranges. Most of these seem to come and go rather rapidly from their host genomes. These possibly beneficial mobile genetic elements may play an important role in driving prokaryotic evolution.
The immune systems that protect organisms from infectious agents invariably have a cost for the host. In bacteria and archaea CRISPR-Cas loci can serve as adaptive immune systems that protect these microbes from infectiously transmitted DNAs. When those DNAs are borne by lytic viruses (phages), this protection can provide a considerable advantage. CRISPR-Cas immunity can also prevent cells from acquiring plasmids and free DNA bearing genes that increase their fitness. Here, we use a combination of experiments and mathematical-computer simulation models to explore this downside of CRISPR-Cas immunity and its implications for the maintenance of CRISPR-Cas loci in microbial populations. We analyzed the conjugational transfer of the staphylococcal plasmid pG0400 into Staphylococcus epidermidis RP62a recipients that bear a CRISPR-Cas locus targeting this plasmid. Contrary to what is anticipated for lytic phages, which evade CRISPR by mutations in the target region, the evasion of CRISPR immunity by plasmids occurs at the level of the host through loss of functional CRISPR-Cas immunity. The results of our experiments and models indicate that more than 10⁻⁴ of the cells in CRISPR-Cas positive populations are defective or deleted for the CRISPR-Cas region and thereby able to receive and carry the plasmid. Most intriguingly, the loss of CRISPR function even by large deletions can have little or no fitness cost in vitro. These theoretical and experimental results can account for the considerable variation in the existence, number and function of CRISPR-Cas loci within and between bacterial species. We postulate that as a consequence of the opposing positive and negative selection for immunity, CRISPR-Cas systems are in a continuous state of flux. They are lost when they bear immunity to laterally transferred beneficial genes, re-acquired by horizontal gene transfer, and ascend in environments where phage are a major source of mortality.
In addition to the virtue of protecting archaea and bacteria from the ravages of lethal viruses (phage), the immunity generated by CRISPR-Cas systems has an evolutionary downside: it can prevent the acquisition of genes and genetic elements required for the adaptation and even the survival of these microbes. Using mathematical models and experiments with Staphylococcus epidermidis and the staphylococcal conjugative plasmid pG0400, we explore how bacteria deal with this evolutionary downside of CRISPR-Cas immunity. Although there are mechanisms by which immune populations of bacteria can acquire essential plasmids without the loss of CRISPR-Cas immunity, the results of our conjugation and fitness cost experiments suggest the most likely mechanism is the deactivation and deletion of this region. These results provide an explanation for the considerable variation in the existence, number and function of CRISPR-Cas within and between species of microbes. Along with other observations, our work also suggests that the CRISPR-Cas loci are in a continuous state of flux: they are acquired by horizontal gene transfer, ascend when populations are confronted with phage, and are rapidly lost when infectiously transmitted genes and genetic elements are required for the adaptation and survival of the population.
A facile and efficient method for the precise editing of large viral genomes is required for the selection of attenuated vaccine strains and the construction of gene therapy vectors. The type II prokaryotic CRISPR-Cas (clustered regularly interspaced short palindromic repeats (CRISPR)-associated (Cas)) RNA-guided nuclease system can be introduced into host cells during viral replication. The CRISPR-Cas9 system robustly stimulates targeted double-stranded breaks in the genomes of DNA viruses, where the non-homologous end joining (NHEJ) and homology-directed repair (HDR) pathways can be exploited to introduce site-specific indels or insert heterologous genes with high frequency. Furthermore, CRISPR-Cas9 can specifically inhibit the replication of the original virus, thereby significantly increasing the abundance of the recombinant virus among progeny virus. As a result, purified recombinant virus can be obtained with only a single round of selection. In this study, we used recombinant adenovirus and type I herpes simplex virus as examples to demonstrate that the CRISPR-Cas9 system is a valuable tool for editing the genomes of large DNA viruses.
The clustered regularly interspaced short palindromic repeats (CRISPR)-associated (Cas) system was discovered as a component of the bacterial acquired immune system that cleaves foreign DNA. This system is now used for site-specific genome editing in a wide range of organisms, including bacteria, yeasts, plants, and animals. However, the use of this approach in non-cellular organisms, such as non-integrating viruses, has not been reported. Because multiple steps are required to construct mutant or recombinant DNA viruses with large genomes using the current approaches, we used the CRISPR-Cas9 system to introduce site-specific indels and insert a foreign gene into an adenoviral vector and wild-type herpes simplex virus. The high efficiency of CRISPR-Cas9 editing allowed for simple construction and purification of recombinant progeny virus. We believe that this new technique will have broad practical significance for selecting attenuated vaccine strains and antiviral drugs, constructing gene therapy vectors, and establishing efficient methods for viral biological studies.
Gardnerella vaginalis is identified as the predominant colonist of the vaginal tracts of women diagnosed with bacterial vaginosis (BV). G. vaginalis can be isolated from healthy women, and an asymptomatic BV state is also recognised. The association of G. vaginalis with different clinical phenotypes could be explained by different cytotoxicity of the strains, presumably based on disparate gene content. The contribution of horizontal gene transfer to shaping the genomes of G. vaginalis is acknowledged. The CRISPR loci of the recently discovered CRISPR/Cas microbial defence system provide a historical view of the exposure of prokaryotes to a variety of foreign genetic elements.
The CRISPR/Cas loci were analysed using available sequence data from three G. vaginalis complete genomes and 18 G. vaginalis draft genomes in the NCBI database, as well as PCR amplicons of the genomic DNA of 17 clinical isolates. The cas genes in the CRISPR/Cas loci of G. vaginalis belong to the E. coli subtype. Approximately 20% of the spacers had matches in the GenBank database. Sequence analysis of the CRISPR arrays revealed that nearly half of the spacers matched G. vaginalis chromosomal sequences. The spacers that matched G. vaginalis chromosomal sequences were determined to not be self-targeting and were presumably neither constituents of mobile-element-associated genes nor derived from plasmids/viruses. The protospacers targeted by these spacers displayed conserved protospacer-adjacent motifs.
The CRISPR/Cas system has been identified in about one half of the analysed G. vaginalis strains. Our analysis of CRISPR sequences did not reveal a potential link between their presence and the virulence of the G. vaginalis strains. Based on the origins of the spacers found in the G. vaginalis CRISPR arrays, we hypothesise that the transfer of genetic material among G. vaginalis strains could be regulated by the CRISPR/Cas mechanism. The present study is the first attempt to determine and analyse the CRISPR loci of bacteria isolated from the human vaginal tract.
Gardnerella vaginalis; Bacterial vaginosis; CRISPR/Cas; Spacer; Repeat; PAM