A major step forward in understanding the effector phase of the pathway came with the discovery of processed RNAs from the locus, termed crRNAs (CRISPR RNAs) or psiRNAs (prokaryotic silencing RNAs). Transcription of the CRISPR repeats initiates in or near the leader sequence and generates a long precursor (pre-crRNA) that can span the entire locus (Lillestol et al., 2006; Lillestol et al., 2009). The pre-crRNA is then endonucleolytically processed into fragments corresponding to the interval between repeats, producing mature products and a laddering pattern of intermediates (Brouns et al., 2008; Carte et al., 2008; Tang et al., 2002; Tang et al., 2005).
Processing of CRISPR content into crRNAs
In E. coli, a complex termed Cascade produces 57nt units from the multimeric precursor transcript by cleavage within the repeat sequence (Brouns et al., 2008
). Cascade is composed of cse1, cse2, cse4, cas5e, and cse3 (also known as CasA-CasE). Within the complex, Cse3/casE (cas subtype E. coli 3) is necessary and sufficient to define the 5′ end of the product. At least two nucleotides are removed from the 3′ end of cse3 products by unknown mechanisms. Remarkably, the analogous sequence-specific cleavage activity in P. furiosus is carried out by a different protein, cas6 (Carte et al., 2008). This protein has no homolog within the E. coli cas operon subtype that includes cse3 (Haft et al., 2005). The product of cse3 or cas6 is an RNA consisting of an 8nt repeat sequence “tag” followed by the spacer sequence, followed by the next partial repeat.
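The cleavage geometry described above can be sketched in a few lines of Python. This is a toy illustration rather than a model of either enzyme: the repeat and spacer sequences are invented placeholders, and the function names `build_pre_crrna` and `process_pre_crrna` are hypothetical. The only property taken from the text is that cleavage occurs once within each repeat, 8 nt upstream of the repeat's 3′ end, so that every product carries an 8nt repeat tag, one spacer, and the remainder of the next repeat.

```python
TAG_LEN = 8  # length of the repeat-derived 5' tag reported in the text


def build_pre_crrna(repeat, spacers):
    """Toy precursor: repeat-spacer units followed by a terminal repeat."""
    return "".join(repeat + spacer for spacer in spacers) + repeat


def process_pre_crrna(pre_crrna, repeat, tag_len=TAG_LEN):
    """Cut once inside every repeat, tag_len nt upstream of its 3' end,
    and return the fragments lying between consecutive cut sites."""
    cut_sites = []
    start = pre_crrna.find(repeat)
    while start != -1:
        cut_sites.append(start + len(repeat) - tag_len)
        start = pre_crrna.find(repeat, start + len(repeat))
    return [pre_crrna[a:b] for a, b in zip(cut_sites, cut_sites[1:])]


# Invented sequences (assumptions for illustration, not real CRISPR loci).
REPEAT = "GTCGCACTCTACATGGCTTAGAAC"
SPACERS = ["ACGT" * 8, "TTGCA" * 7, "GATC" * 9]

products = process_pre_crrna(build_pre_crrna(REPEAT, SPACERS), REPEAT)
for spacer, product in zip(SPACERS, products):
    # Each product = 8-nt tag + spacer + partial downstream repeat.
    assert product == REPEAT[-TAG_LEN:] + spacer + REPEAT[:-TAG_LEN]
    print(len(product), product[:TAG_LEN])
```

In this sketch each product has the length of one repeat plus one spacer. The additional 3′ trimming described in the text is deliberately not modeled.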
In P. furiosus, an additional processing step was characterized that produces two discrete species of mature psiRNA, 38-45nt and 43-46nt, depending on the spacer length (Hale et al., 2008). This final step is presumed to be exonucleolytic. The resulting RNAs maintain the 5′ repeat tag, but lose the downstream repeat-derived sequence (Carte et al., 2008; Hale et al., 2008). In E. coli, potentially similar, shorter species can also be seen on northern blots in addition to the prominent 57mers (see Brouns et al., 2008), but these have not been discussed in the literature. In S. acidocaldarius, CRISPR-derived small RNAs appear as products from 35 to 52 nt, presumably generated by endonucleolytic cleavage of long precursors (Lillestol et al., 2006). Thus, at present, the maturation to a 35-46nt RNA appears to be a conserved processing feature. An examination of the ribonucleoprotein complexes (RNPs) that are assembled on the RNAs revealed that the precursor and mature crRNAs are found in distinct RNPs, providing the first details of the processing/assembly pathway (Hale et al., 2008).
The structures of T. thermophilus cse3 and P. furiosus cas6 explain their common endonucleolytic function. These proteins display similar architectures, despite their lack of sequence homology (Carte et al., 2008; Ebihara et al., 2006; van der Oost et al., 2009). Both enzymes are composed of a duplicated ferredoxin fold, a common domain topology that also underlies the well-known RNA-recognition motif (RRM) domain. However, the conserved sequence signatures of RRMs are absent in cse3 and cas6. The two proteins contain a spatially conserved active site with an essential histidine residue and a G-rich loop (van der Oost et al., 2009). The crystal structure of T. thermophilus cse2/casB, another component of the Cascade complex, reveals a novel α-helical fold with a conserved basic patch that may be involved in binding RNA (Agari et al., 2008).
Accumulating evidence supports a model in which processed crRNAs serve as sequence-specific guides during the effector stage of resistance against invading elements. This was demonstrated by reconstitution of a functioning CRISPR system in E. coli BL21(DE3), which lacks endogenous cas genes (Brouns et al., 2008). These cells were engineered to express the Cascade complex, as well as Cas3 and a modified CRISPR locus in which spacer sequences targeting phage lambda had been incorporated. This was sufficient to create de novo resistance to the phage and allowed an exploration of the properties of cas proteins that were important for mounting an effective response. The catalytic activity of the cse3 nuclease within Cascade proved essential, indicating a crucial role for crRNAs in the overall defense pathway. Cas3 is not required for the generation of crRNAs, as described above, but was necessary for phage resistance in this system. This fact, along with a consideration of the domain structure of cas3, has led to the proposal that it might catalyze crRNA-guided destruction of foreign nucleic acids. Cas3 has an HD nuclease domain fused to a DExD/H helicase module (Makarova et al., 2002). The two domains also exist as separate proteins in the CRISPR loci of some species, indicating some flexibility in this arrangement (Makarova et al., 2002). Interestingly, one such stand-alone HD domain in S. solfataricus was demonstrated to possess nucleolytic activity, being able to use either dsDNA or dsRNA as a substrate (Han and Krauss, 2009). Like cas1, cas2 (the only remaining gene in the E. coli CRISPR locus) was not required for the effector phase, implicating this gene in some other aspect of the response.
Obvious analogies to eukaryotic RNAi-related pathways provoked an initial model in which a crRNA-guided complex would target mRNAs derived from the invader (Makarova et al., 2006). However, multiple lines of evidence point to the direct recognition of foreign DNA by the core CRISPR machinery. To date, sequence analyses have only detected spacers from phage with DNA genomes (Mojica et al., 2009; Wiedenheft et al., 2009). Conclusions based upon this observation must nevertheless be tempered by the relative scarcity of RNA phage sequences. Detailed analyses in S. thermophilus (Bolotin et al., 2005), and more broadly in bacteria and archaea (Makarova et al., 2006; Shah et al., 2009), show that spacers encode crRNAs corresponding to both the coding and template strands of the phage, without a preference for any particular region. Similar conclusions can be reached by examination of spacers arising in experimentally induced phage-resistant mutants of S. thermophilus (Barrangou et al., 2007). Here, some bias toward the coding strand was observed, but this may be explained by the higher occurrence of the proto-spacer adjacent motif on that strand (Deveau et al., 2008). Direct support for DNA rather than mRNA targeting comes from E. coli, where the use of engineered spacers demonstrated that effective crRNAs could be produced from either the coding or template strand of lambda phage (Brouns et al., 2008).
Additional support for DNA targeting comes from a recent study of CRISPR activity in a clinical isolate of S. epidermidis, RP62a (Marraffini and Sontheimer, 2008). Here, the CRISPR locus contains a spacer against the nickase gene of staphylococcal conjugative plasmids. Since nickase activity is required for conjugation only in donor cells, targeting of its mRNA would ablate RP62a's function as a donor but not as a recipient. The spacer negated both donor and recipient activity, as predicted by a DNA-targeting model. Insertion of the proto-spacer into a non-conjugative plasmid prevented that plasmid from being transformed into RP62a, demonstrating that resistance was not linked to the mode of plasmid entry. The DNA-targeting model was further supported by the observation that the targeted region was equally effective in either orientation within the plasmid. As an additional test of the model, Marraffini and Sontheimer cleverly designed a variant of the plasmid in which the nickase proto-spacer was interrupted by a self-splicing intron. This split the targeted sequence in the plasmid but reformed it in the encoded mRNA. This construct was capable of conjugation into RP62a, indicating that the CRISPR defense was circumvented when the DNA target was disrupted, even though the mRNA target remained as a potential substrate.
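The logic of the intron experiment can be written out as a small Python sketch that compares what the two competing models predict. Everything in it is an illustrative assumption: the sequences are invented, the names `splice` and `interference_predicted` are hypothetical, the transcript is written in the DNA alphabet for simplicity, and strand orientation is ignored. It only encodes the reasoning in the text, namely that the intron disrupts the target in the plasmid DNA while splicing restores it in the mRNA.

```python
def splice(transcript, intron):
    """Remove one copy of the intron, mimicking self-splicing."""
    return transcript.replace(intron, "", 1)


def interference_predicted(plasmid, intron, protospacer, model):
    """Return True if the given targeting model predicts CRISPR interference."""
    if model == "DNA":
        # DNA targeting needs the intact proto-spacer in the plasmid itself.
        return protospacer in plasmid
    if model == "mRNA":
        # mRNA targeting needs the intact proto-spacer in the spliced transcript.
        return protospacer in splice(plasmid, intron)
    raise ValueError(f"unknown model: {model}")


# Invented sequences (assumptions for illustration only).
PROTOSPACER = "ACGTTGCAAGGCTTAGCATCGGATCCTTAAGGCCA"
INTRON = "GTGCGCCTTTCTTTAG"
FLANK5, FLANK3 = "TTTTTTTT", "CCCCCCCC"

intact = FLANK5 + PROTOSPACER + FLANK3
split = FLANK5 + PROTOSPACER[:17] + INTRON + PROTOSPACER[17:] + FLANK3

for name, plasmid in (("intact", intact), ("intron-split", split)):
    for model in ("DNA", "mRNA"):
        print(name, model, interference_predicted(plasmid, INTRON, PROTOSPACER, model))
```

Only the DNA-targeting model predicts loss of interference for the intron-split construct, which is the outcome reported in the text: the split plasmid conjugated into RP62a.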
The above evidence notwithstanding, very recent results demonstrate a capacity for RNA targeting in CRISPR systems containing the RAMP module. In P. furiosus, Hale and colleagues identified the six RAMP module proteins in a ribonucleoprotein complex containing the mature 39nt and 45nt psiRNAs with the shared 8nt 5′ repeat tag (Hale et al., 2009). Remarkably, the complex possessed endonucleolytic activity toward RNA targets with sequence complementarity to endogenous psiRNAs. The same activity was shown in a reconstituted, recombinant complex programmed by either 39nt or 45nt psiRNAs, or both, with only cmr5 being dispensable for the cleavage. The cleavage site is positioned 14nt from the 3′ end of the psiRNA, leading to different products for the 39nt and 45nt guides. While the exact nuclease within the complex is not known, the activity leaves a 3′ phosphate or 2′-3′ cyclic phosphate and requires divalent ions. Thus, for CRISPR systems that encode a RAMP module, the effector stage appears to include a mode of targeting the RNA components of the phage's life cycle. As an alternative to targeting phage, the RAMP module of the CRISPR system may be co-opted to impact endogenous cellular processes. Further investigation into the in vivo function of this mode will undoubtedly provide intriguing insights into the question.
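The statement that a fixed 14nt distance from the psiRNA 3′ end yields different products for the two guides can be illustrated with a short Python calculation. This is a sketch under stated assumptions: the function name `target_cut_offset` is hypothetical, the two guides are assumed to derive from the same spacer and to share their 5′ end (so they differ only at the 3′ end), and the target is assumed to pair antiparallel with the guide so that the guide 3′ end faces the 5′ side of the target site. Offsets are counted on the target from the nucleotide paired with the 3′ end of the longer guide.

```python
RULER_NT = 14  # distance of the cut from the guide 3' end, as stated in the text


def target_cut_offset(guide_len, longest_guide_len, ruler=RULER_NT):
    """Offset of the cut on the target, counted from the target nucleotide
    that pairs with the 3' end of the longest guide (an assumed convention)."""
    # Guides share a 5' end, so a shorter guide's 3' end pairs with a
    # target position shifted by the difference in guide length.
    shift = longest_guide_len - guide_len
    return shift + ruler


for guide_len in (39, 45):
    print(guide_len, target_cut_offset(guide_len, 45))
```

Under these assumptions the two guides direct cuts that lie 6 nt apart on the same target, which is one way to read the different products mentioned in the text.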
It remains to be clarified how the newly discovered RNA cleavage activity relates to DNA targeting. The E. coli, S. thermophilus and S. epidermidis systems that were used to demonstrate DNA targeting do not possess a RAMP module (Haft et al., 2005). However, the RAMP-module-containing CRISPR systems, such as that of P. furiosus, may still retain an ability to affect the phage DNA directly. Two such systems in S. solfataricus and B. halodurans (Haft et al., 2005) contain spacers that are both sense and antisense to extrachromosomal elements, suggesting that DNA targeting is active in these organisms (Makarova et al., 2006).