Streptococcus pyogenes, one of the major human pathogens, is a unique species since it has acquired diverse strain-specific virulence properties mainly through the acquisition of streptococcal prophages. In addition, S. pyogenes possesses clustered regularly interspaced short palindromic repeats (CRISPR)/Cas systems that can restrict horizontal gene transfer (HGT) including phage insertion. Therefore, it was of interest to examine the relationship between CRISPR and acquisition of prophages in S. pyogenes. Although two distinct CRISPR loci were found in S. pyogenes, some strains lacked CRISPR, and these strains possessed significantly more prophages than CRISPR-harboring strains. We also found that S. pyogenes CRISPR contained fewer spacers than those of other streptococci. The spacer contents, however, suggested that the CRISPR do limit phage insertions. In addition, we found a significant inverse correlation between the number of spacers and the number of prophages in S. pyogenes. It was therefore suggested that S. pyogenes CRISPR have permitted phage insertion by lacking their own spacers. Interestingly, in two closely related S. pyogenes strains (SSI-1 and MGAS315), CRISPR activity appeared to be impaired following the insertion of phage genomes into the repeat sequences. Detailed analysis of this prophage insertion site suggested that MGAS315 is the ancestral strain of SSI-1. Analysis of 35 additional streptococcal genomes suggested that the influence of CRISPR on phage insertion varies among species even within the same genus. Our results suggested that limitations in CRISPR content could explain the characteristic acquisition of prophages and might contribute to strain-specific pathogenesis in S. pyogenes.
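The inverse correlation between spacer and prophage counts can be illustrated with a rank correlation. The abstract does not say which statistic was used, so Spearman's coefficient is an assumption here, and the function names are hypothetical. This is a minimal sketch, not the authors' analysis.

```python
def ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Made-up toy counts per strain (NOT data from the study).
spacers_per_strain = [0, 1, 3, 4, 6]
prophages_per_strain = [6, 5, 4, 2, 1]
rho = spearman(spacers_per_strain, prophages_per_strain)
```

A negative `rho` indicates that strains with more spacers tend to carry fewer prophages. Assessing significance would additionally require a test such as a permutation test, which is not shown.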
The categorisation and structural analysis of Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRs) sequences from 195 microbial genomes show that repeats from diverse organisms can be grouped based on sequence similarity, and that some groups have pronounced secondary structures with compensatory base changes.
Clustered regularly interspaced short palindromic repeats (CRISPRs) are a novel class of direct repeats, separated by unique spacer sequences of similar length, that are present in approximately 40% of bacterial and most archaeal genomes analyzed to date. More than 40 gene families, called CRISPR-associated sequences (CASs), appear in conjunction with these repeats and are thought to be involved in the propagation and functioning of CRISPRs. It has been recently shown that CRISPR provides acquired resistance against viruses in prokaryotes.
Here we analyze CRISPR repeats identified in 195 microbial genomes and show that they can be organized into multiple clusters based on sequence similarity. Some of the clusters present stable, highly conserved RNA secondary structures, while others lack detectable structures. Stable secondary structures exhibit multiple compensatory base changes in the stem region, indicating evolutionary and functional conservation.
We show that the repeat-based classification corresponds to, and expands upon, a previously reported CAS gene-based classification, including specific relationships between CRISPR and CAS subtypes.
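Grouping repeats by sequence similarity can be sketched as a greedy clustering. The similarity measure below is `difflib.SequenceMatcher.ratio()` from the Python standard library, used only as a stand-in for an alignment-based score. The 0.8 threshold and the function names are assumptions, not values or methods from the study.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Similarity in [0, 1]; a stand-in for an alignment-based identity score."""
    return SequenceMatcher(None, a, b).ratio()


def cluster_repeats(repeats, threshold=0.8):
    """Greedy clustering: each repeat joins the first cluster whose
    representative (its first member) is at least `threshold` similar,
    otherwise it starts a new cluster."""
    clusters = []
    for repeat in repeats:
        for cluster in clusters:
            if similarity(repeat, cluster[0]) >= threshold:
                cluster.append(repeat)
                break
        else:
            clusters.append([repeat])
    return clusters
```

Greedy clustering depends on input order and on the choice of representative, so a real analysis would more likely use a proper alignment and a standard clustering method. The sketch only shows the grouping idea.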
Yersinia pestis, the pathogen of plague, has greatly influenced human history on a global scale. Clustered Regularly Interspaced Short Palindromic Repeat (CRISPR), an element participating in immunity against phage invasion, is composed of short repeated sequences separated by unique spacers and provides the basis of the spoligotyping technology. In the present research, three CRISPR loci were analyzed in 125 strains of Y. pestis from 26 natural plague foci of China, the former Soviet Union and Mongolia, to validate a CRISPR-based genotyping method and to better understand the adaptive microevolution of Y. pestis.
Using PCR amplification, sequencing and online data processing, a high degree of genetic diversity was revealed in all three CRISPR elements. The distribution of spacers and their arrays in Y. pestis strains is strongly region- and focus-specific, allowing the construction of a hypothetical evolutionary model of Y. pestis. This model suggests a transmission route of microtus strains that encircled the Takla Makan Desert and the ZhunGer Basin. Starting from Tadjikistan, one branch passed through the Kunlun Mountains, and moved to the Qinghai-Tibet Plateau. Another branch went north via the Pamirs Plateau, the Tianshan Mountains, the Altai Mountains and the Inner Mongolian Plateau. Other Y. pestis lineages might have originated from certain areas along those routes.
CRISPR can provide important information for genotyping and evolutionary research of bacteria, which will help to trace the source of outbreaks. The resulting data will make possible the development of very low cost and high-resolution assays for the systematic typing of any new isolate.
Clustered regularly interspaced short palindromic repeats (CRISPR) confer sequence-dependent, adaptive resistance in prokaryotes against viruses and plasmids via incorporation of short sequences, called spacers, derived from foreign genetic elements. CRISPR loci are thus considered to provide records of past infections. To describe the host-parasite (i.e., cyanophages and plasmids) interactions involving the bloom-forming freshwater cyanobacterium Microcystis aeruginosa, we investigated CRISPR in four M. aeruginosa strains and in two previously sequenced genomes. The number of spacers in each locus was larger than the average among prokaryotes. All spacers were strain specific, except for a string of 11 spacers shared in two closely related strains, suggesting diversification of the loci. Using CRISPR repeat-based PCR, 24 CRISPR genotypes were identified in a natural cyanobacterial community. Among 995 unique spacers obtained, only 10 sequences showed similarity to M. aeruginosa phage Ma-LMM01. Of these, six spacers showed only silent or conservative nucleotide mutations compared to Ma-LMM01 sequences, suggesting a strategy by the cyanophage to avert CRISPR immunity dependent on nucleotide identity. These results imply that host-phage interactions can be divided into M. aeruginosa-cyanophage combinations rather than pandemics of population-wide infectious cyanophages. Spacer similarity also showed frequent exposure of M. aeruginosa to small cryptic plasmids that were observed only in a few strains. Thus, the diversification of CRISPR implies that M. aeruginosa has been challenged by diverse communities (almost entirely uncharacterized) of cyanophages and plasmids.
In order to get further insights into the role of the clustered, regularly interspaced, short palindromic repeats (CRISPRs) in Escherichia coli, we analyzed the CRISPR diversity in a collection of 290 strains, in the phylogenetic framework of the strains represented by multilocus sequence typing (MLST). The set included 263 natural E. coli isolates exposed to various environments and isolated over a 20-year period from humans and animals, as well as 27 fully sequenced strains. Our analyses confirm that there are two largely independent pairs of CRISPR loci (CRISPR1 and -2 and CRISPR3 and -4), each associated with a different type of cas genes (Ecoli and Ypest, respectively), but that each pair of CRISPRs has similar dynamics. Strikingly, the major phylogenetic group B2 is almost devoid of CRISPRs. The majority of genomes analyzed lack Ypest cas genes and contain CRISPR3 with spacers matching Ypest cas genes. The analysis of relatedness between strains in terms of spacer repertoire and the MLST tree shows a pattern where closely related strains (MLST phylogenetic distance of <0.005 corresponding to at least hundreds of thousands of years) often exhibit identical CRISPRs while more distantly related strains (MLST distance of >0.01) exhibit completely different CRISPRs. This suggests rare but radical turnover of spacers in CRISPRs rather than CRISPR gradual change. We found no link between the presence, size, or content of CRISPRs and the lifestyle of the strains. Our data suggest that, within the E. coli species, CRISPRs do not have the expected characteristics of a classical immune system.
Clustered regularly interspaced short palindromic repeats (CRISPR) and their associated genes are linked to a mechanism of acquired resistance against bacteriophages. Bacteria can integrate short stretches of phage-derived sequences (spacers) within CRISPR loci to become phage resistant. In this study, we further characterized the efficiency of CRISPR1 as a phage resistance mechanism in Streptococcus thermophilus. First, we show that CRISPR1 is distinct from previously known phage defense systems and is effective against the two main groups of S. thermophilus phages. Analyses of 30 bacteriophage-insensitive mutants of S. thermophilus indicate that the addition of one new spacer in CRISPR1 is the most frequent outcome of a phage challenge and that the iterative addition of spacers increases the overall phage resistance of the host. The added new spacers have a size of between 29 and 31 nucleotides, with 30 being by far the most frequent. Comparative analysis of 39 newly acquired spacers with the complete genomic sequences of the wild-type phages 2972, 858, and DT1 demonstrated that the newly added spacer must be identical to a region (named proto-spacer) in the phage genome to confer a phage resistance phenotype. Moreover, we found a CRISPR1-specific sequence (NNAGAAW) located downstream of the proto-spacer region that is important for the phage resistance phenotype. Finally, we show through the analyses of 20 mutant phages that virulent phages are rapidly evolving through single nucleotide mutations as well as deletions, in response to CRISPR1.
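The two requirements described above, an exact spacer match in the phage genome and the NNAGAAW motif downstream of it, can be checked with a short script. This is a minimal sketch under stated assumptions: the motif is taken to start immediately after the proto-spacer, only the given strand is searched, and `find_protospacers` is a hypothetical helper name. In IUPAC notation N is any base and W is A or T.

```python
import re

# NNAGAAW in IUPAC notation: N = any base, W = A or T.
PAM_REGEX = re.compile(r"[ACGT]{2}AGAA[AT]")
PAM_LENGTH = 7


def find_protospacers(phage_genome, spacer):
    """Return (start, has_motif) for every exact occurrence of `spacer`.

    `has_motif` is True when the 7 bases immediately downstream of the
    match fit NNAGAAW (assumption: the motif is directly adjacent).
    Only the given strand is searched; the reverse complement is not.
    """
    genome = phage_genome.upper()
    spacer = spacer.upper()
    hits = []
    start = genome.find(spacer)
    while start != -1:
        end = start + len(spacer)
        downstream = genome[end:end + PAM_LENGTH]
        hits.append((start, bool(PAM_REGEX.fullmatch(downstream))))
        start = genome.find(spacer, start + 1)
    return hits
```

A phage carrying a point mutation inside the proto-spacer would yield no hit at all, while a mutation in the downstream motif would yield a hit with `has_motif` set to `False`. These correspond to the two escape routes the abstract attributes to mutant phages.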
The CRISPR–Cas (clustered regularly interspaced short palindromic repeats–CRISPR-associated proteins) modules are adaptive immunity systems that are present in many archaea and bacteria. These defence systems are encoded by operons that have an extraordinarily diverse architecture and a high rate of evolution for both the cas genes and the unique spacer content. Here, we provide an updated analysis of the evolutionary relationships between CRISPR–Cas systems and Cas proteins. Three major types of CRISPR–Cas system are delineated, with a further division into several subtypes and a few chimeric variants. Given the complexity of the genomic architectures and the extremely dynamic evolution of the CRISPR–Cas systems, a unified classification of these systems should be based on multiple criteria. Accordingly, we propose a 'polythetic' classification that integrates the phylogenies of the most common cas genes, the sequence and organization of the CRISPR repeats and the architecture of the CRISPR–cas loci.
Streptococcus thermophilus, similar to other Bacteria and Archaea, has developed defense mechanisms to protect cells against invasion by foreign nucleic acids, such as virus infections and plasmid transformations. One defense system recently described in these organisms is the CRISPR-Cas system (Clustered Regularly Interspaced Short Palindromic Repeats loci coupled to CRISPR-associated genes). Two S. thermophilus CRISPR-Cas systems, CRISPR1-Cas and CRISPR3-Cas, have been shown to actively block phage infection. The CRISPR1-Cas system interferes by cleaving foreign dsDNA entering the cell in a length-specific and orientation-dependent manner. Here, we show that the S. thermophilus CRISPR3-Cas system acts by cleaving phage dsDNA genomes at the same specific position inside the targeted protospacer as observed with the CRISPR1-Cas system. Only one cleavage site was observed in all tested strains. Moreover, we observed that the CRISPR1-Cas and CRISPR3-Cas systems are compatible and, when both systems are present within the same cell, provide increased resistance against phage infection because both cleave the invading dsDNA. We also determined that overall phage resistance efficiency is correlated to the total number of newly acquired spacers in both CRISPR loci.
Clustered, regularly interspaced short palindromic repeats (CRISPR) provide bacteria and archaea with sequence-specific, acquired defense against plasmids and phage. Because mobile elements constitute up to 25% of the genome of multidrug-resistant (MDR) enterococci, it was of interest to examine the codistribution of CRISPR and acquired antibiotic resistance in enterococcal lineages. A database was built from 16 Enterococcus faecalis draft genome sequences to identify commonalities and polymorphisms in the location and content of CRISPR loci. With this data set, we were able to detect identities between CRISPR spacers and sequences from mobile elements, including pheromone-responsive plasmids and phage, suggesting that CRISPR regulates the flux of these elements through the E. faecalis species. Based on conserved locations of CRISPR and CRISPR-cas loci and the discovery of a new CRISPR locus with associated functional genes, CRISPR3-cas, we screened additional E. faecalis strains for CRISPR content, including isolates predating the use of antibiotics. We found a highly significant inverse correlation between the presence of a CRISPR-cas locus and acquired antibiotic resistance in E. faecalis, and examination of an additional eight E. faecium genomes yielded similar results for that species. A mechanism for CRISPR-cas loss in E. faecalis was identified. The inverse relationship between CRISPR-cas and antibiotic resistance suggests that antibiotic use inadvertently selects for enterococcal strains with compromised genome defense.
For many bacteria, including the opportunistically pathogenic enterococci, antibiotic resistance is mediated by acquisition of new DNA and is frequently encoded on mobile DNA elements such as plasmids and transposons. Certain enterococcal lineages have recently emerged that are characterized by abundant mobile DNA, including numerous viruses (phage), and plasmids and transposons encoding multiple antibiotic resistances. These lineages cause hospital infection outbreaks around the world. The striking influx of mobile DNA into these lineages is in contrast to what would be expected if a self (genome)-defense system was present. Clustered, regularly interspaced short palindromic repeat (CRISPR) defense is a recently discovered mechanism of prokaryotic self-defense that provides a type of acquired immunity. Here, we find that antibiotic resistance and possession of complete CRISPR loci are inversely related and that members of recently emerged high-risk enterococcal lineages lack complete CRISPR loci. Our results suggest that antibiotic therapy inadvertently selects for enterococci with compromised genome defense.
Prokaryotes thrive in spite of the vast number and diversity of their viruses. This partly results from the evolution of mechanisms to inactivate or silence the action of exogenous DNA. Among these, Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are unique in providing adaptive immunity against elements with high local resemblance to genomes of previously infecting agents. Here, we analyze the CRISPR loci of 51 complete genomes of Escherichia and Salmonella. CRISPR are found in two pairs of loci in Escherichia and in a single pair in Salmonella, each pair showing a similar turnover rate, repeat sequence and putative linkage to a common set of cas genes. Yet, phylogeny shows that CRISPR and associated cas genes have different evolutionary histories, the latter being frequently exchanged or lost. In our set, one CRISPR pair seems specialized in plasmids, often matching genes coding for the replication, conjugation and antirestriction machinery. Strikingly, this pair also matches the cognate cas genes, in which case these genes are absent. The unexpectedly high conservation of this anti-CRISPR suggests selection to counteract the invasion of mobile elements containing functional CRISPR/cas systems. There are few spacers in most CRISPR, which rarely match genomes of known phages. Furthermore, we found that strains that diverged less than 250 thousand years ago show virtually identical CRISPR. The lack of congruence between cas, CRISPR and the species phylogeny and the slow pace of CRISPR change make CRISPR poor epidemiological markers in enterobacteria. All these observations are at odds with the abundant and dynamic repertoire of spacers expected in an immune system aiming at protecting bacteria from phages. Since we observe purifying selection for the maintenance of CRISPR, these results suggest that alternative evolutionary roles for CRISPR remain to be uncovered.
Clustered regularly interspaced short palindromic repeat (CRISPR) elements are a particular family of tandem repeats present in prokaryotic genomes, in almost all archaea and in about half of bacteria, and which participate in a mechanism of acquired resistance against phages. They consist of a succession of direct repeats (DR) of 24–47 bp separated by similar-sized unique sequences (spacers). In the large majority of cases, the direct repeats are highly conserved, while the number and nature of the spacers are often quite diverse, even among strains of the same species. Furthermore, the acquisition of new units (DR + spacer) was shown to happen almost exclusively on one side of the locus. Therefore, the CRISPR presents an interesting genetic marker for comparative and evolutionary analysis of closely related bacterial strains. CRISPRcompar is a web service created to assist biologists in the CRISPR typing process. Two tools facilitate the in silico investigation: CRISPRcomparison and CRISPRtionary. This website is freely accessible at http://crispr.u-psud.fr/CRISPRcompar/.
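The typing idea described above can be sketched in a few lines: give each distinct spacer a number, write each strain's array as a list of those numbers, and compare arrays from the older end. This is an illustrative sketch, not the CRISPRcompar implementation. The function names are hypothetical, and each array is assumed to be listed with the newest spacer first, so shared ancestry shows up as a common suffix.

```python
def build_spacer_dictionary(arrays):
    """Assign an integer ID to each distinct spacer, in order of first appearance.

    `arrays` maps a strain name to its list of spacer sequences.
    """
    dictionary = {}
    for spacers in arrays.values():
        for spacer in spacers:
            if spacer not in dictionary:
                dictionary[spacer] = len(dictionary) + 1
    return dictionary


def encode_arrays(arrays, dictionary):
    """Rewrite each strain's array as a list of spacer IDs."""
    return {
        strain: [dictionary[spacer] for spacer in spacers]
        for strain, spacers in arrays.items()
    }


def shared_suffix_length(a, b):
    """Count spacers shared at the old end of two encoded arrays
    (assumption: the newest spacer is listed first)."""
    n = 0
    while n < len(a) and n < len(b) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n


# Toy spacer strings for illustration only (not real spacer sequences).
toy_arrays = {
    "strain_A": ["GGG", "CCC", "AAA"],
    "strain_B": ["TTT", "CCC", "AAA"],
}
toy_ids = build_spacer_dictionary(toy_arrays)
toy_encoded = encode_arrays(toy_arrays, toy_ids)
```

Because new units are added almost exclusively at one end, two strains that share a long suffix but differ at the newest end plausibly descend from a common ancestor and acquired spacers independently afterwards.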
Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR), together with associated genes (cas), form the CRISPR–cas adaptive immune system, which can provide resistance to viruses and plasmids in bacteria and archaea. Here, we use mathematical models, population dynamic experiments, and DNA sequence analyses to investigate the host–phage interactions in a model CRISPR–cas system, Streptococcus thermophilus DGCC7710 and its virulent phage 2972. At the molecular level, the bacteriophage-immune mutant bacteria (BIMs) and CRISPR–escape mutant phage (CEMs) obtained in this study are consistent with those anticipated from an iterative model of this adaptive immune system: resistance by the addition of novel spacers and phage evasion of resistance by mutation in matching sequences or flanking motifs. While CRISPR BIMs were readily isolated and CEMs generated at high rates (frequencies in excess of 10⁻⁶), our population studies indicate that there is more to the dynamics of phage–host interactions and the establishment of a BIM–CEM arms race than predicted from existing assumptions about phage infection and CRISPR–cas immunity. Among the unanticipated observations are: (i) the invasion of phage into populations of BIMs resistant by the acquisition of one (but not two) spacers, (ii) the survival of sensitive bacteria despite the presence of high densities of phage, and (iii) the maintenance of phage-limited communities due to the failure of even two-spacer BIMs to become established in populations with wild-type bacteria and phage. We attribute (i) to incomplete resistance of single-spacer BIMs. Based on the results of additional modeling and experiments, we postulate that (ii) and (iii) can be attributed to the phage infection-associated production of enzymes or other compounds that induce phenotypic phage resistance in sensitive bacteria and kill resistant BIMs.
We present evidence in support of these hypotheses and discuss the implications of these results for the ecology and (co)evolution of bacteria and phage.
The evidence that the CRISPR regions of the genomes of archaea and bacteria play a role in the ecology and (co)evolution of these microbes and their viruses is overwhelming: (i) the spacers (variable sequences of 26–72 bp of DNA between the repeats of this region) of these prokaryotes are homologous to the DNA of viruses in their communities; (ii) experimentally, the acquisition and incorporation of spacers of viral DNA can protect these organisms from subsequent infection by these viruses; (iii) experimentally, viruses evade this immunity by mutation in homologous protospacers or protospacer-adjacent motifs (PAMs). Not so clear are the nature and magnitude of the role CRISPR plays in this ecology and evolution. Here, we use mathematical models, experiments with Streptococcus thermophilus and the phage 2972, and DNA sequence analyses to explore the contribution of CRISPR–cas immunity to the ecology and (co)evolution of bacteria and their viruses. The results of this study suggest that the contribution of CRISPR to the ecology of bacteria and phage is more modest and limited, and the conditions for a CRISPR–mediated coevolutionary arms race between these organisms more restrictive, than anticipated from models based on the canonical view of phage infection and CRISPR–cas immunity.
Clustered regularly interspaced short palindromic repeats (CRISPRs) are a family of DNA direct repeats found in many prokaryotic genomes. Repeats of 21–37 bp typically show weak dyad symmetry and are separated by regularly sized, nonrepetitive spacer sequences. Four CRISPR-associated (Cas) protein families, designated Cas1 to Cas4, are strictly associated with CRISPR elements and always occur near a repeat cluster. Some spacers originate from mobile genetic elements and are thought to confer “immunity” against the elements that harbor these sequences. In the present study, we have systematically investigated uncharacterized proteins encoded in the vicinity of these CRISPRs and found many additional protein families that are strictly associated with CRISPR loci across multiple prokaryotic species. Multiple sequence alignments and hidden Markov models have been built for 45 Cas protein families. These models identify family members with high sensitivity and selectivity and classify key regulators of development, DevR and DevS, in Myxococcus xanthus as Cas proteins. These identifications show that CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a repeat cluster or filling the region between two repeat clusters. Distinctive subsets of the collection of Cas proteins recur in phylogenetically distant species and correlate with characteristic repeat periodicity. The analyses presented here support initial proposals of mobility of these units, along with the likelihood that loci of different subtypes interact with one another as well as with host cell defensive, replicative, and regulatory systems. It is evident from this analysis that CRISPR/cas loci are larger, more complex, and more heterogeneous than previously appreciated.
The family of clustered regularly interspaced short palindromic repeats (CRISPRs) describes a class of DNA repeats found in nearly half of all bacterial and archaeal genomes. These DNA repeat regions have a remarkably regular structure: unique sequences of constant size, called spacers, sit between each pair of repeats. The DNA repeats do not encode proteins, but appear to be transcribed and processed into small RNAs that may have any number of functions, including resistance to any phage (i.e., virus of bacteria) whose sequence matches a spacer; spacers change rapidly as microbial strains evolve. This work describes 41 new CRISPR-associated (cas) gene families, which are always found near these repeats, in addition to the four previously known. It shows that CRISPR systems belong to different classes, with different repeat patterns, sets of genes, and species ranges. Most of these seem to come and go rather rapidly from their host genomes. These possibly beneficial mobile genetic elements may play an important role in driving prokaryotic evolution.
CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) loci, together with cas (CRISPR–associated) genes, form the CRISPR/Cas adaptive immune system, a primary defense strategy that eubacteria and archaea mobilize against foreign nucleic acids, including phages and conjugative plasmids. Short spacer sequences separated by the repeats are derived from foreign DNA and direct interference to future infections. The availability of hundreds of shotgun metagenomic datasets from the Human Microbiome Project (HMP) enables us to explore the distribution and diversity of known CRISPRs in human-associated microbial communities and to discover new CRISPRs. We propose a targeted assembly strategy to reconstruct CRISPR arrays, which whole-metagenome assemblies fail to identify. For each known CRISPR type (identified from reference genomes), we use its direct repeat consensus sequence to recruit reads from each HMP dataset and then assemble the recruited reads into CRISPR loci; the unique spacer sequences can then be extracted for analysis. We also identified novel CRISPRs or new CRISPR variants in contigs from whole-metagenome assemblies and used targeted assembly to more comprehensively identify these CRISPRs across samples. We observed that the distributions of CRISPRs (including 64 known and 86 novel ones) are largely body-site specific. We provide detailed analysis of several CRISPR loci, including novel CRISPRs. For example, known streptococcal CRISPRs were identified in most oral microbiomes, totaling ∼8,000 unique spacers: samples resampled from the same individual and oral site shared the most spacers; different oral sites from the same individual shared significantly fewer, while different individuals had almost no common spacers, indicating the impact of subtle niche differences on the evolution of CRISPR defenses. We further demonstrate potential applications of CRISPRs to the tracing of rare species and the virus exposure of individuals. 
This work indicates the importance of effective identification and characterization of CRISPR loci to the study of the dynamic ecology of microbiomes.
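The read-recruitment and spacer-extraction steps of the targeted approach can be sketched as follows. This covers only part of the workflow: the assembly of recruited reads into loci is not shown, reverse-complement reads are ignored, and the Hamming-distance matching, the `max_mismatches` default and all function names are assumptions made for illustration.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def repeat_positions(read, repeat, max_mismatches=2):
    """Start positions of non-overlapping approximate repeat matches,
    found by a left-to-right scan."""
    k, hits, i = len(repeat), [], 0
    while i + k <= len(read):
        if hamming(read[i:i + k], repeat) <= max_mismatches:
            hits.append(i)
            i += k  # jump past this repeat copy
        else:
            i += 1
    return hits


def recruit_reads(reads, repeat, max_mismatches=2):
    """Keep reads that contain at least one approximate copy of the repeat."""
    return [r for r in reads if repeat_positions(r, repeat, max_mismatches)]


def extract_spacers(read, repeat, max_mismatches=2):
    """Return the sequences lying between consecutive repeat copies in a read."""
    k = len(repeat)
    hits = repeat_positions(read, repeat, max_mismatches)
    spacers = []
    for left, right in zip(hits, hits[1:]):
        if right > left + k:  # skip back-to-back repeats with nothing between
            spacers.append(read[left + k:right])
    return spacers
```

Collecting the extracted spacers into one set per sample would then allow the kind of comparison the abstract reports, for example counting how many spacers two samples have in common.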
Human bodies are complex ecological systems in which various microbial organisms and viruses interact with each other and with the human host. The Human Microbiome Project (HMP) has resulted in >700 datasets of shotgun metagenomic sequences, from which we can learn about the compositions and functions of human-associated microbial communities. CRISPR/Cas systems are a widespread class of adaptive immune systems in bacteria and archaea, providing acquired immunity against foreign nucleic acids: CRISPR/Cas defense pathways involve integration of viral- or plasmid-derived DNA segments into CRISPR arrays (forming spacers between repeated structural sequences), and expression of short crRNAs from these single repeat-spacer units, to generate interference to future invading foreign genomes. Powered by an effective computational approach (the targeted assembly approach for CRISPR), our analysis of CRISPR arrays in the HMP datasets provides the very first global view of bacterial immunity systems in human-associated microbial communities. The great diversity of CRISPR spacers we observed among different body sites, in different individuals, and in single individuals over time, indicates the impact of subtle niche differences on the evolution of CRISPR defenses and indicates the key role of bacteriophage (and plasmids) in shaping human microbial communities.
The clustered regularly interspaced short palindromic repeat (CRISPR)/Cas system confers acquired heritable immunity against mobile nucleic acid elements in prokaryotes, limiting phage infection and horizontal gene transfer of plasmids. In CRISPR arrays, characteristic repeats are interspersed with similarly sized nonrepetitive spacers derived from transmissible genetic elements and acquired when the cell is challenged with foreign DNA. New spacers are added sequentially and the number and type of CRISPR units can differ among strains, providing a record of phage/plasmid exposure within a species and giving a valuable typing tool. The aim of this work was to investigate CRISPR diversity in the highly homogeneous species Erwinia amylovora, the causal agent of fire blight. A total of 18 CRISPR genotypes were defined within a collection of 37 cosmopolitan strains. Strains from Spiraeoideae plants clustered in three major groups: groups II and III were composed exclusively of bacteria originating from the United States, whereas group I generally contained strains of more recent dissemination obtained in Europe, New Zealand, and the Middle East. Strains from Rosoideae and Indian hawthorn (Rhaphiolepis indica) clustered separately and displayed a higher intrinsic diversity than that of isolates from Spiraeoideae plants. Reciprocal exclusion was generally observed between plasmid content and cognate spacer sequences, supporting the role of the CRISPR/Cas system in protecting against foreign DNA elements. However, in several group III strains, retention of plasmid pEU30 is inconsistent with a functional CRISPR/Cas system.
CRISPR/Cas, bacterial and archaeal systems of interference with foreign genetic elements such as viruses or plasmids, consist of DNA loci called CRISPR cassettes (a set of variable spacers regularly separated by palindromic repeats) and associated cas genes. When a CRISPR spacer sequence exactly matches a sequence in a viral genome, the cell can become resistant to the virus. The CRISPR/Cas systems function through small RNAs originating from longer CRISPR cassette transcripts. While laboratory strains of Escherichia coli contain a functional CRISPR/Cas system (as judged by the appearance of phage resistance under conditions of artificial co-overexpression of Cas genes and a CRISPR cassette engineered to target a λ phage), no natural phage resistance due to CRISPR system function was observed in this best-studied organism and no E. coli CRISPR spacer matches sequences of well-studied E. coli phages. To better understand the apparently “silent” E. coli CRISPR/Cas system, we systematically characterized processed transcripts from CRISPR cassettes. Using an engineered strain with a genomically located spacer matching phage λ, we show that endogenous levels of CRISPR cassette and cas gene expression allow only weak protection against infection with the phage. However, derepression of the CRISPR/Cas system by disruption of the hns gene leads to a high level of protection.
CRISPR (Clustered, Regularly, Interspaced, Short, Palindromic Repeats) loci have been shown to provide prokaryotes with an adaptive immunity against viruses and plasmids. CRISPR arrays are transcribed and processed into small CRISPR RNA molecules, which base-pair with invading DNA or RNA and lead to its degradation by CRISPR-associated (Cas) protein complexes. New spacers can be acquired by active CRISPR/Cas systems, and thus the sequences of these spacers provide a record of the past “infection history” of the organism. Recently we used spacer sequences from archaeal genomes to infer gene exchange events among archaeal species and genera and to demonstrate that at least in this domain of life CRISPR indeed has an anti-viral role.
CRISPR; Lateral Gene Transfer; archaea; horizontal gene transfer; viruses
CRISPR (Clustered, Regularly, Interspaced, Short, Palindromic Repeats) loci provide prokaryotes with an adaptive immunity against viruses and other mobile genetic elements. CRISPR arrays can be transcribed and processed into small crRNA molecules, which are then used by the cell to target the foreign nucleic acid. Since spacers are accumulated by active CRISPR/Cas systems, the sequences of these spacers provide a record of the past "infection history" of the organism.
Here we analyzed all currently known spacers present in archaeal genomes and identified their source by DNA similarity. While nearly 50% of archaeal spacers matched mobile genetic elements, such as plasmids or viruses, several others matched chromosomal genes of other organisms, primarily other archaea. Thus, networks of gene exchange between archaeal species were revealed by the spacer analysis, including many cases of inter-genus and inter-species gene transfer events. Spacers that recognize viral sequences tend to be located further away from the leader sequence, implying that there exists a selective pressure for their retention.
CRISPR spacers provide direct evidence for extensive gene exchange in archaea, especially within genera, and support the current dogma where the primary role of the CRISPR/Cas system is anti-viral and anti-plasmid defense.
Open peer review
This article was reviewed by: Profs. W. Ford Doolittle, John van der Oost, Christa Schleper (nominated by board member Prof. J Peter Gogarten)
CRISPR; Lateral Gene transfer; Horizontal gene transfer; viruses; archaea; competence
Clustered regularly interspaced short palindromic repeats (CRISPRs) comprise a family of short DNA repeat sequences that are separated by nonrepetitive spacer sequences and, in combination with a suite of Cas proteins, are thought to function as an adaptive immune system against invading DNA. The number of CRISPR arrays in a bacterial chromosome is variable, and the content of each array can differ in both repeat number and in the presence or absence of specific spacers. We utilized a comparative sequence analysis of CRISPR arrays of the plant pathogen Erwinia amylovora to uncover previously unknown genetic diversity in this species. A total of 85 E. amylovora strains varying in geographic isolation (North America, Europe, New Zealand, and the Middle East), host range, plasmid content, and streptomycin sensitivity/resistance were evaluated for CRISPR array number and spacer variability. From these strains, 588 unique spacers were identified in the three CRISPR arrays present in E. amylovora, and these arrays could be categorized into 20, 17, and 2 pattern types, respectively. Analysis of the relatedness of spacer content differentiated most apple and pear strains isolated in the eastern U.S. from western U.S. strains. In addition, we identified North American strains that shared CRISPR genotypes with strains isolated on other continents. E. amylovora strains from Rubus and Indian hawthorn contained mostly unique spacers compared to apple and pear strains, while strains from loquat shared 79% of spacers with apple and pear strains. Approximately 23% of the spacers matched known sequences, with 16% targeting plasmids and 5% targeting bacteriophage. The plasmid pEU30, isolated in E. amylovora strains from the western U.S., was targeted by 55 spacers. Lastly, we used spacer patterns and content to determine that streptomycin-resistant strains of E. amylovora from Michigan were low in diversity and matched corresponding streptomycin-sensitive strains from the background population.
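A spacer-sharing percentage such as the one reported for loquat strains could be computed with simple set operations. The sketch below is only an illustration: the function name `shared_spacer_fraction`, the spacer labels, and the choice of denominator (the spacers of the strain group being described) are assumptions, since the abstract does not state how the percentage was derived.

```python
def shared_spacer_fraction(group_spacers, reference_spacers):
    """Fraction of the group's unique spacers that also occur in the reference set.

    Assumption: spacers are compared as exact strings (IDs or sequences),
    and the denominator is the number of unique spacers in the group.
    """
    group = set(group_spacers)
    reference = set(reference_spacers)
    if not group:
        return 0.0
    return len(group & reference) / len(group)


# Hypothetical spacer labels, not data from the study.
loquat_like = ["sp1", "sp2", "sp3", "sp4"]
apple_pear_like = ["sp1", "sp2", "sp3", "sp9"]
print(shared_spacer_fraction(loquat_like, apple_pear_like))
```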
Bacteria and archaea face continual onslaughts of rapidly diversifying viruses and plasmids. Many prokaryotes maintain adaptive immune systems known as clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated genes (Cas). CRISPR-Cas systems are genomic sensors that serially acquire viral and plasmid DNA fragments (spacers) that are utilized to target and cleave matching viral and plasmid DNA in subsequent genomic invasions, offering critical immunological memory. Only 50% of sequenced bacteria possess CRISPR-Cas immunity, in contrast to over 90% of sequenced archaea. To probe why half of bacteria lack CRISPR-Cas immunity, we combined comparative genomics and mathematical modeling. Analysis of hundreds of diverse prokaryotic genomes shows that CRISPR-Cas systems are substantially more prevalent in thermophiles than in mesophiles. With sequenced bacteria disproportionately mesophilic and sequenced archaea mostly thermophilic, the presence of CRISPR-Cas appears to depend more on environmental temperature than on bacterial-archaeal taxonomy. Mutation rates are typically severalfold higher in mesophilic prokaryotes than in thermophilic prokaryotes. To quantitatively test whether accelerated viral mutation leads microbes to lose CRISPR-Cas systems, we developed a stochastic model of virus-CRISPR coevolution. The model competes CRISPR-Cas-positive (CRISPR-Cas+) prokaryotes against CRISPR-Cas-negative (CRISPR-Cas−) prokaryotes, continually weighing the antiviral benefits conferred by CRISPR-Cas immunity against its fitness costs. Tracking this cost-benefit analysis across parameter space reveals viral mutation rate thresholds beyond which CRISPR-Cas cannot provide sufficient immunity and is purged from host populations. These results offer a simple, testable viral diversity hypothesis to explain why mesophilic bacteria disproportionately lack CRISPR-Cas immunity. 
More generally, fundamental limits on the adaptability of biological sensors (Lamarckian evolution) are predicted.
A remarkable recent discovery in microbiology is that bacteria and archaea possess systems conferring immunological memory and adaptive immunity. Clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated genes (CRISPR-Cas) are genomic sensors that allow prokaryotes to acquire DNA fragments from invading viruses and plasmids. Providing immunological memory, these stored fragments destroy matching DNA in future viral and plasmid invasions. CRISPR-Cas systems also provide adaptive immunity, keeping up with mutating viruses and plasmids by continually acquiring new DNA fragments. Surprisingly, less than 50% of mesophilic bacteria, in contrast to almost 90% of thermophilic bacteria and Archaea, maintain CRISPR-Cas immunity. Using mathematical modeling, we probe this dichotomy, showing how increased viral mutation rates can explain the reduced prevalence of CRISPR-Cas systems in mesophiles. Rapidly mutating viruses outrun CRISPR-Cas immune systems, likely decreasing their prevalence in bacterial populations. Thus, viral adaptability may select against, rather than for, immune adaptability in prokaryotes.
The CRISPR-Cas (Clustered Regularly Interspaced Short Palindromic Repeats – CRISPR associated proteins) system provides adaptive immunity in archaea and bacteria. A hallmark of CRISPR-Cas is the involvement of short crRNAs that guide associated proteins in the destruction of invading DNA or RNA. We present three fundamentally distinct processing pathways in the cyanobacterium Synechocystis sp. PCC6803 for a subtype I-D (CRISPR1), and two type III systems (CRISPR2 and CRISPR3), which are located together on the plasmid pSYSA. Using high-throughput transcriptome analyses and assays of transcript accumulation we found all CRISPR loci to be highly expressed, but the individual crRNAs had profoundly varying abundances despite single transcription start sites for each array. In a computational analysis, CRISPR3 spacers with stable secondary structures displayed a greater ratio of degradation products. These structures might interfere with the loading of the crRNAs into RNP complexes, explaining the varying abundances. The maturation of CRISPR1 and CRISPR2 transcripts depends on at least two different Cas6 proteins. Mutation of gene sll7090, encoding a Cmr2 protein, led to the disappearance of all CRISPR3-derived crRNAs, providing in vivo evidence for a function of Cmr2 in the maturation, regulation of expression, Cmr complex formation or stabilization of CRISPR3 transcripts. Finally, we optimized CRISPR repeat structure prediction and the results indicate that the spacer context can influence individual repeat structures.
Central to the disparate adaptive immune systems of archaea and bacteria are clustered regularly interspaced short palindromic repeats (CRISPR). The spacer regions derive from invading genetic elements and, via RNA intermediates and associated proteins, target and cleave nucleic acids of the invader. Here we demonstrate the hyperactive uptake of hundreds of unique spacers within CRISPR loci associated with type I and IIIB immune systems of a hyperthermophilic archaeon. Infection with an environmental virus mixture resulted in the exclusive uptake of protospacers from a co-infecting putative conjugative plasmid. Spacer uptake occurred by two distinct mechanisms in only one of two CRISPR loci subfamilies present. In two loci, insertions, often multiple, occurred adjacent to the leader while in a third locus single spacers were incorporated throughout the array. Protospacer DNAs were excised from the invading genetic element immediately after CCN motifs, on either strand, with the secondary cut apparently produced by a ruler mechanism. Over a 10-week period, there was a gradual decrease in the number of wild-type cells present in the culture but the virus and putative conjugative plasmid were still propagating. The results underline the complex dynamics of CRISPR-based immune systems within a population infected with genetic elements.
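The excision rule described above (protospacers taken immediately after CCN motifs, on either strand, with a fixed-length second cut) can be sketched as a simple scan. The function name `candidate_protospacers` and the `spacer_len` default are hypothetical; the abstract does not give the ruler length, so the value here is only a placeholder.

```python
def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string over the alphabet A, C, G, T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def candidate_protospacers(dna: str, spacer_len: int = 38):
    """List (strand, start, protospacer) for every CCN motif on either strand.

    The protospacer is the spacer_len bases directly after the 3-base CCN
    motif. Coordinates on the "-" strand refer to the reverse-complemented
    sequence. spacer_len is an assumed placeholder for the ruler length.
    """
    hits = []
    for strand, seq in (("+", dna), ("-", revcomp(dna))):
        # Stop early enough that a full-length protospacer fits after the motif.
        for i in range(len(seq) - 3 - spacer_len + 1):
            if seq[i:i + 2] == "CC":  # third motif base (N) may be anything
                start = i + 3
                hits.append((strand, start, seq[start:start + spacer_len]))
    return hits


# Toy input with a short ruler length so the example stays readable.
for hit in candidate_protospacers("AACCTGATTACAGG", spacer_len=4):
    print(hit)
```

Scanning both the sequence and its reverse complement is one straightforward way to cover "either strand"; mapping minus-strand coordinates back to the forward strand is left out to keep the sketch short.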
Well-studied innate immune systems exist throughout bacteria and archaea, but a more recently discovered genomic locus may offer prokaryotes surprising immunological adaptability. Mediated by a cassette-like genomic locus termed Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR), the microbial adaptive immune system differs from its eukaryotic immune analogues by incorporating new immunities unidirectionally. CRISPR thus stores genomically recoverable timelines of virus-host coevolution in natural organisms refractory to laboratory cultivation. Here we combined a population genetic mathematical model of CRISPR-virus coevolution with six years of metagenomic sequencing to link the recoverable genomic dynamics of CRISPR loci to the unknown population dynamics of virus and host in natural communities. Metagenomic reconstructions in an acid-mine drainage system document CRISPR loci conserving ancestral immune elements to the base-pair across thousands of microbial generations. This ‘trailer-end conservation’ occurs despite rapid viral mutation and despite rapid prokaryotic genomic deletion. The trailer-ends of many reconstructed CRISPR loci are also largely identical across a population. ‘Trailer-end clonality’ occurs despite predictions of host immunological diversity due to negative frequency dependent selection (kill the winner dynamics). Statistical clustering and model simulations explain this lack of diversity by capturing rapid selective sweeps by highly immune CRISPR lineages. Potentially explaining ‘trailer-end conservation,’ we record the first example of a viral bloom overwhelming a CRISPR system. The polyclonal viruses bloom even though they share sequences previously targeted by host CRISPR loci. Simulations show how increasing random genomic deletions in CRISPR loci purges immunological controls on long-lived viral sequences, allowing polyclonal viruses to bloom and depressing host fitness. 
Our results thus link documented patterns of genomic conservation in CRISPR loci to an evolutionary advantage against persistent viruses. By maintaining old immunities, selection may be tuning CRISPR-mediated immunity against viruses reemerging from lysogeny or migration.
Most microbes appear unculturable in the laboratory, limiting our knowledge of how virus and prokaryotic host evolve in natural systems. However, a genomic locus found in many prokaryotes, CRISPR, may offer cultivation-independent probes of virus-microbe coevolution. Utilizing nearby genes, CRISPR can serially incorporate short viral and plasmid sequences. These sequences bind and cleave cognate regions in subsequent viral and plasmid insertions, conferring adaptive anti-viral and anti-plasmid immunity. By incorporating sequences unidirectionally, CRISPR also provides timelines of virus-prokaryote coevolution. Yet, CRISPR only incorporates 30–80 base-pair viral sequences, leaving incomplete coevolutionary recordings. To reconstruct the missing coevolutionary dynamics shaping natural CRISPRs, we combined metagenomic reconstructions with population-scale mathematical modeling. Capturing rare and rapid sweeps of CRISPR diversity by highly immune lines, mathematical modeling explains why naturally reconstructed CRISPR loci are often largely identical across a population. Both model and experiment further document surprising proliferations of old viral sequences against which hosts had preexisting CRISPR immunity. Due to these deadly blooms of ancestral viral elements, CRISPR's conservation of old immune sequences appears to confer a selective advantage. This may explain the striking immunological memory documented in CRISPR loci, which occurs despite rapid viral mutation and despite rapid deletions in prokaryotic genomes.
Clustered Regularly Interspaced Palindromic Repeats (CRISPRs) are a novel type of direct repeat found in a wide range of bacteria and archaea. CRISPRs are beginning to attract attention because of their proposed mechanism; that is, defending their hosts against invading extrachromosomal elements such as viruses. Existing repeat detection tools do a poor job of identifying CRISPRs due to the presence of unique spacer sequences separating the repeats. In this study, a new tool, CRT, is introduced that rapidly and accurately identifies CRISPRs in large DNA strings, such as genomes and metagenomes.
CRT was compared to two CRISPR detection tools, Patscan and Pilercr. In terms of correctness, CRT was shown to be very reliable, demonstrating significant improvements over Patscan in precision, recall and quality. When compared to Pilercr, CRT showed improved performance in recall and quality. In terms of speed, CRT proved to be a huge improvement over Patscan. CRT and Pilercr were comparable in speed; however, CRT was faster for genomes containing large numbers of repeats.
In this paper a new tool was introduced for the automatic detection of CRISPR elements. This tool, CRT, showed some important improvements over current techniques for CRISPR identification. CRT's approach to detecting repetitive sequences is straightforward. It uses a simple sequential scan of a DNA sequence and detects repeats directly without any major conversion or preprocessing of the input. This leads to a program that is easy to describe and understand; yet it is very accurate, fast and memory efficient, being O(n) in space and O(nm/l) in time.
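The general idea of detecting repeats directly in the input sequence can be illustrated with a toy k-mer index. This is not CRT's actual algorithm and makes no claim to its complexity bounds: the function name `find_repeat_arrays`, the k-mer index, and all parameter defaults are assumptions chosen for illustration. The sketch reports k-mers that recur at least `min_repeats` times with start-to-start distances inside a fixed window, which is the signature of repeats separated by similarly sized spacers.

```python
def find_repeat_arrays(dna, k=8, min_spacing=40, max_spacing=120, min_repeats=3):
    """Return (kmer, positions) for k-mers recurring at CRISPR-like intervals.

    Spacing is the distance between the starts of consecutive occurrences
    (repeat length plus spacer length). All defaults are illustrative guesses.
    """
    # Index every k-mer by the positions where it starts (one pass over dna).
    positions = {}
    for i in range(len(dna) - k + 1):
        positions.setdefault(dna[i:i + k], []).append(i)

    arrays = []
    for kmer, pos in positions.items():
        best, run = [], [pos[0]]
        for p in pos[1:]:
            if min_spacing <= p - run[-1] <= max_spacing:
                run.append(p)          # extend the current regularly spaced run
            else:
                if len(run) > len(best):
                    best = run
                run = [p]              # spacing broke, start a new run
        if len(run) > len(best):
            best = run
        if len(best) >= min_repeats:
            arrays.append((kmer, best))
    return arrays


# Toy array: one 8-base repeat separated by two different 6-base spacers.
repeat = "GTTTTAGA"
toy = "AAA" + repeat + "ACGTAC" + repeat + "TTGCAA" + repeat + "CCC"
print(find_repeat_arrays(toy, k=8, min_spacing=10, max_spacing=20, min_repeats=3))
```

A real detector would additionally extend seed matches to full repeat boundaries and tolerate small mismatches between repeat copies; both are omitted here.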
All immune systems must distinguish self from non-self to repel invaders without inducing autoimmunity. Clustered, regularly interspaced, short palindromic repeat (CRISPR) loci protect bacteria and archaea from invasion by phage and plasmid DNA through a genetic interference pathway [1–9]. CRISPR loci are present in ~40% and ~90% of sequenced bacterial and archaeal genomes respectively [10] and evolve rapidly, acquiring new spacer sequences to adapt to highly dynamic viral populations [1, 11–13]. Immunity requires a sequence match between the invasive DNA and the spacers that lie between CRISPR repeats [1–9]. Each cluster is genetically linked to a subset of the cas (CRISPR-associated) genes [14–16] that collectively encode >40 families of proteins involved in adaptation and interference. CRISPR loci encode small CRISPR RNAs (crRNAs) that contain a full spacer flanked by partial repeat sequences [2, 17–19]. CrRNA spacers are thought to identify targets by direct Watson-Crick pairing with invasive “protospacer” DNA [2, 3], but how they avoid targeting the spacer DNA within the encoding CRISPR locus itself is unknown. Here we have defined the mechanism of CRISPR self/non-self discrimination. In Staphylococcus epidermidis, target/crRNA mismatches at specific positions outside of the spacer sequence license foreign DNA for interference, whereas extended pairing between crRNA and CRISPR DNA repeats prevents autoimmunity. Hence, this CRISPR system uses the base-pairing potential of crRNAs not only to specify a target but also to spare the bacterial chromosome from interference. Differential complementarity outside of the spacer sequence is a built-in feature of all CRISPR systems, suggesting that this mechanism is a broadly applicable solution to the self/non-self dilemma that confronts all immune pathways.
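The discrimination logic described above can be summarized as a small decision function. This is a deliberately simplified sketch: the name `classify_target` and the example sequences are hypothetical, all sequences are written in the same sense so that "pairing" is modeled as string identity, and any mismatch in the flank is treated as licensing interference, whereas the abstract refers to mismatches at specific positions that are not listed here.

```python
def classify_target(cr_tag, cr_spacer, target_flank, target_protospacer):
    """Classify a candidate target as 'no match', 'self', or 'foreign'.

    cr_tag:             repeat-derived portion of the crRNA next to the spacer
    cr_spacer:          spacer portion of the crRNA
    target_flank:       target sequence in the position corresponding to cr_tag
    target_protospacer: target sequence in the position corresponding to the spacer

    Simplification: identity stands in for base pairing, and a mismatch
    anywhere in the flank counts as "not self".
    """
    if cr_spacer != target_protospacer:
        return "no match"   # no spacer match, so no interference
    if cr_tag == target_flank:
        return "self"       # pairing extends into the repeat: CRISPR locus, spared
    return "foreign"        # spacer matches but flank differs: licensed target


# Hypothetical sequences for illustration only.
tag, spacer = "ACGAGAAC", "GATTACA"
print(classify_target(tag, spacer, "ACGAGAAC", "GATTACA"))  # CRISPR locus itself
print(classify_target(tag, spacer, "TTTTTTTT", "GATTACA"))  # invading DNA
print(classify_target(tag, spacer, "TTTTTTTT", "CCCCCCC"))  # unrelated DNA
```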