The 5′-untranslated regions (5′-UTRs) of all gammaretroviruses contain a conserved “double hairpin motif” (ΨCD) that is required for genome packaging. Both hairpins (SL-C and SL-D) contain GACG tetraloops that, in isolated RNAs, are capable of forming “kissing” interactions stabilized by two intermolecular G-C base pairs. We have determined the three-dimensional structure of the double hairpin from the Moloney Murine Leukemia Virus (MoMuLV) ([ΨCD]2, 132 nucleotides, 42.8 kDaltons) using a 2H-edited NMR spectroscopy-based approach. This approach enabled the detection of 1H-1H dipolar interactions that were not observed in previous studies of isolated SL-C and SL-D hairpin RNAs using traditional 1H-1H correlated and 1H-13C-edited NMR methods. The hairpins participate in intermolecular cross-kissing interactions (SL-C to SL-D’ and SL-C’ to SL-D), and stack in an end-to-end manner (SL-C to SL-D and SL-C’ to SL-D’) that gives rise to an elongated overall shape (ca. 95 Å by 45 Å by 25 Å). The global structure was confirmed by cryo-electron tomography (cryo-ET), making [ΨCD]2 simultaneously the smallest RNA to be structurally characterized to date by cryo-ET and among the largest to be determined by NMR. Our findings suggest that, in addition to promoting dimerization, [ΨCD]2 functions as a scaffold that helps initiate virus assembly by exposing a cluster of conserved UCUG elements for binding to the cognate nucleocapsid domains of assembling viral Gag proteins.
Considerable effort has been made over the past three decades to understand the mechanisms that retroviruses use to selectively package their RNA genomes.1-8 All retroviruses contain two positive-strand RNA genomes that are encapsidated within the central core of the virus.9 Both copies of the genome are required for replication, allowing strand-transfer to occur at strand breaks during reverse transcription and promoting genetic diversity through recombination in heterozygous particles.10-13 Genome selection is mediated by packaging elements, called Ψ-sites, which are typically located within the 5′-untranslated region (5′-UTR) of the RNA.1,2,9,14,15 Elements that promote RNA packaging generally overlap with those that promote dimerization,4,16 and both processes are mediated by the nucleocapsid domain (NC) of the retroviral Gag protein. The RNA exists as a non-covalently linked dimer in mature particles, and although small amounts of monomeric genomes can be isolated under mildly denaturing conditions from young virions,17 there is now considerable genetic evidence that genomes are selected for packaging as dimers, supporting early proposals that dimerization and packaging are mechanistically coupled.5,7,18-24
Much of what is currently known about genome packaging has been derived from studies of the Moloney Murine Leukemia Virus (MoMuLV). MoMuLV is a gammaretrovirus that has been widely used as a vector for gene delivery2 and is closely related to Xenotropic murine leukemia virus (MLV)-related virus (XMRV), a potential human pathogen25 with links to aggressive forms of prostate cancer26 and Chronic Fatigue Syndrome (CFS).27 Although both disease links have been questioned,28,29 very recent studies suggest that CFS may be caused by a genetically diverse group of MLV-related viruses.30 Early nucleotide accessibility mapping, phylogenetic analyses and free energy calculations indicated that the MoMuLV 5′-UTR consists of a series of closely spaced hairpins, and that the RNA undergoes changes in secondary structure upon dimerization.31-34 Mutagenesis studies revealed that dimerization and packaging are promoted by four stem loops, two that are capable of forming intermolecular duplexes (DIS-1 and DIS-2)16,31,33,35-38 and two that can form “kissing interactions” mediated by base pairing between residues within their conserved GACG tetraloops (SL-C and SL-D).39 The latter hairpins are essential for genome packaging and have been proposed to function as a “double hairpin motif” (ΨCD).40
We recently presented evidence that register shifts in the base pairing of DIS-1 and DIS-2 that occur upon dimerization expose conserved UCUG elements that are capable of binding NC with high affinity, Figure 1.41 The UCUG elements are sequestered by base pairing in the monomeric RNAs and unable to bind NC. Although these studies suggest well-defined roles of the DIS-1 and DIS-2 hairpins in regulating exposure of NC binding sites, the mechanistic role of the ΨCD double hairpin remains less well understood. Deletion of ΨCD can reduce vector RNA packaging to ~1% of wild-type levels, and introduction of a segment containing ΨCD and some adjacent residues (A285-C419) into a non-packaged vector RNA can promote packaging to nearly wild-type levels.42 However, isolated SL-C, SL-D, and ΨCD RNAs do not exhibit significant affinity for NC.43 In addition, although the GACG tetraloops are capable of forming kissing dimer contacts in small RNAs, they do not appear to contribute significantly to the monomer-dimer equilibrium measured for the intact 5′-UTR (although they enhance the rate of equilibration in vitro).31 Structures have been determined by NMR for the kissing dimer form of the isolated SL-D hairpin,39 as well as for larger RNAs that contain loop mutations that prevent kissing interactions,41,44 and site-directed hydroxyl radical cleavage and ribose reactivity (SHAPE) experiments have been used to derive a model for the tandem hairpin of the closely related Moloney Murine Sarcoma Virus (MoMuSV, which differs from the MoMuLV tandem hairpin by a single G338/U substitution).45 The latter studies suggested that the MuSV [ΨCD]2 RNA forms a highly stable, globular structure, in which the GACG tetraloops of SL-C and SL-D participate in cross-kissing (SL-C to SL-D’ and SL-C’ to SL-D) interactions.46
To better understand the role of ΨCD in dimerization and packaging, we have characterized the solution-state equilibrium properties of the tandem hairpin in the context of the intact core encapsidation signal (ΨCES), which includes DIS-2, SL-C and SL-D, and have determined the structure of the dimeric MoMuLV tandem hairpin ([ΨCD]2) using a 2H-edited NMR approach. The NMR data confirmed the cross-kissing interactions predicted by chemical probing and mutagenesis.46 However, local and global structural features, and particularly the elongated overall shape of the NMR structure, differ significantly from the more globular SHAPE-derived model. We therefore independently assessed the overall shape of [ΨCD]2 by cryo-electron tomography (cryo-ET). Both the individual RNA densities in the unfiltered tomograms and the final cryo-ET average density map were consistent with the elongated structure determined by NMR. The combination of high-resolution local structural interactions derived by 2H-edited 2D NOESY NMR with global structural shape information derived by cryo-ET could serve as a general approach for structural studies of larger RNAs, particularly those that adopt multiple equilibrium conformations in solution. Our findings are consistent with a packaging model in which the tandem hairpin functions as a scaffold that, upon dimerization, helps expose a cluster of high-affinity Gag binding sites.
The conformational behavior of the double hairpin motif was initially characterized in the context of a larger ΨCES construct that included residues of DIS-2, Fig. 1. Previous studies showed that ΨCES exists predominantly as a monomer at low RNA concentrations and low ionic strength, and that the DIS-2 hairpin adopts two slowly inter-converting conformers,41 Fig. 1b. We now show that the non-kissing form of SL-C also exists in two slowly inter-converting conformations, one in which conserved residues G329-G332 form a tetraloop and G338 to A341 form a bulge, and another in which several of these residues form Watson-Crick base pairs, Fig. 1c. Both of these structures (and only these structures) are predicted to occur based on free energy calculations with MFOLD.47 Analysis of 2D NOESY spectra obtained for an isolated SL-C RNA confirmed the presence of the predicted structures (Supplementary Figure S1), and NMR spectra obtained for native ΨCES exhibited exchange cross peaks consistent with those observed for SL-C and indicative of the conformational equilibrium shown in Fig. 1. The exchange peaks persisted in NMR spectra obtained for ΨCES under physiological concentrations of salts that favor dimerization (140 mM KCl, 10 mM NaCl, 1 mM MgCl2), even though the intensities of the resolved diagonal peaks indicated that the population of non-kissing conformer is low (< 5%).
MFOLD calculations indicated that substitution of the U328-A333 base pair preceding the SL-C GACG tetraloop by an A-U base pair would prevent formation of the minor, non-kissing conformer. Interestingly, a comparison of the nucleotide sequences of the gammaretroviruses revealed that this substitution occurs naturally. We therefore prepared RNA constructs in which these residues were swapped. As predicted, these conservative substitutions (U328A/A333U) precluded formation of the alternate, non-kissing SL-C conformer and gave rise to NMR spectra that lacked the associated conformational exchange peaks (Figure 2b and Supplementary Figure S1). Therefore, in order to reduce crowding in the NMR spectra and simplify analyses, subsequent NMR studies were conducted with RNAs containing the U328A/A333U substitutions, Figure 2c.
To ensure that the conservative U328A/A333U substitution did not have unexpected effects on RNA packaging in vivo, RNA encapsidation efficiencies were measured for RNAs containing mutant and wild type leader sequences. Virions were produced by transient transfection of human 293T cells, and quantified with values for wild type particles set to 100%. Previous work has demonstrated that virion quantification by virion protein (e.g., reverse transcriptase activity48) or by host 7SL RNA yields indistinguishable values.49,50 7SL is a host RNA that is incorporated into retroviruses in proportion to virion proteins, in a manner independent of viral genomic RNA.49-51 Therefore, the amount of 7SL RNA in a cell-free virus sample is directly proportional to virion proteins and can be used to normalize for the amount of virus in that sample. The RNase protection assay that was used to quantify viral RNA packaging is shown in Figure 2d,e. Figure 2d shows the riboprobe used, which allowed simultaneous quantification of encapsidated 7SL and virion genomic RNA. RNase digestion products are shown in Figure 2e. Analysis of RNA isolated from virions produced by transient transfection revealed that the U328A/A333U mutant RNA was packaged at levels comparable to wild type genomic RNA, while a mutant from which regions classically defined as sufficient to promote packaging of a heterologous RNA8 were deleted was packaged more than 200-fold less efficiently than either wild type or the U328A/A333U mutant.
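The normalization described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic (genomic RNA signal divided by 7SL signal, expressed relative to wild type set to 100%); the function name, argument names, and the idea of passing band intensities as plain numbers are assumptions, not details taken from the assay.

```python
def packaging_efficiency(genomic_signal, seven_sl_signal,
                         wt_genomic_signal, wt_seven_sl_signal):
    """Return genomic RNA packaging as a percentage of wild type.

    Each genomic RNA signal is first divided by the 7SL signal from the
    same virus sample, which stands in for the amount of virus present.
    The normalized ratio is then expressed relative to the wild type
    ratio, so wild type evaluates to 100.
    All names here are hypothetical; inputs are assumed to be
    background-corrected band intensities in arbitrary units.
    """
    if min(seven_sl_signal, wt_seven_sl_signal, wt_genomic_signal) <= 0:
        raise ValueError("reference signals must be positive")
    sample_ratio = genomic_signal / seven_sl_signal
    wild_type_ratio = wt_genomic_signal / wt_seven_sl_signal
    return 100.0 * sample_ratio / wild_type_ratio
```

With this definition, a sample whose genomic-to-7SL ratio is half that of wild type evaluates to 50%, regardless of how much virus was loaded in either lane.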
NMR spectra obtained for ΨCES under conditions that favor dimerization exhibit signals diagnostic of kissing interactions involving the GACG tetraloops of SL-C and SL-D. Unfortunately, the chemical shift differences between SL-C and SL-D loop residues are very small, and it was not possible to unambiguously differentiate between SL-C to SL-C, SL-D to SL-D, and SL-C to SL-D kissing modes on the basis of the 2D NOESY spectra obtained for fully protonated, or nucleotide-specifically protonated/perdeuterated [ΨCES]2 RNAs. We therefore obtained 2D NOESY data for a ΨCES sample prepared by segmentally ligating differentially deuterated fragments using T4 RNA ligase. Specifically, we ligated an SL-BC fragment, in which the guanosines were fully protonated and all other nucleotides were perdeuterated (GH-SL-BC), with an SL-D fragment that contained protonated adenosines and deuterated G, C and U nucleotides (AH-SL-D), Figure 3a,b. The AH-SL-D fragment was prepared using a plasmid that encoded the hammerhead and HDV ribozymes at the 5′- and 3′-ends of the RNA, respectively, in order to obtain homogeneous products with adenosine and 2′-3′-cyclic phosphate groups at the 5′- and 3′-termini, respectively.52 After ribozyme cleavage, the SL-D fragment was treated with polynucleotide kinase to produce the desired 5′-monophosphorylated (donor) terminus, and was used in the ligation reaction with excess amounts of the SL-BC fragment. Efficient ligation was achieved without the use of splints, with typical yields (based on the limiting SL-D fragment) of ~85%, Fig. 3b.
2D NOESY spectra obtained for the [SL-C]2 kissing dimer exhibited spectral features similar to those reported previously for [SL-D]2. In particular, the H4′ proton of G334 (corresponding to C368 of SL-D) gave rise to an unusual upfield-shifted NMR signal (2.8 ppm) and exhibited an inter-molecular NOE with the aromatic H2 proton of A330, Fig. 3c. A similar pattern of NOEs was observed in the 2D NOESY spectrum obtained for the fully protonated [ΨCES]2 sample (data not shown). Although these data confirmed that SL-C and SL-D both participate in kissing interactions, they did not allow unambiguous differentiation between SL-C:SL-C/SL-D:SL-D and SL-C:SL-D kissing modes. However, the observation of an A364-H2 to G334-H4′ NOE cross peak in the spectrum obtained for the segmentally labeled [ΨCES]2 sample (Fig. 3d) provides clear evidence that the kissing interface is formed by the loop residues of SL-C and SL-D.
To date, only a small handful of structures have been reported for RNAs comprising more than 50 nucleotides.53 We therefore focused our initial structural studies on the 132-nucleotide [ΨCD]2 dimer. A major impediment to NMR studies of larger RNAs is severe signal degeneracy resulting from the presence of only four different common nucleotides. Although spectral resolution of smaller RNAs can be increased through the use of multi-dimensional 13C-edited NMR experiments, this approach is problematic when applied to larger RNAs due to signal losses and broadening resulting from strong 1H-13C dipolar coupling of aromatic C-H groups that are critical for assignment and structure determination.54 We therefore employed a strategy that involves the collection and analysis of 2D NOESY spectra for RNA samples prepared by in vitro transcription using different combinations of fully protonated, fully deuterated, and partially deuterated nucleotides. This approach was intended to avoid sensitivity and resolution problems associated with 13C labeling. Throughout this paper, we use a nomenclature in which only the proton-containing nucleotides are denoted, with superscripts indicating the positions of protons in partially deuterated nucleotides (8, 2, and R = protonated on the C-8, C-2, and ribose carbons, respectively) and a lack of superscripts denoting full protonation. For example, a sample of [ΨCD]2 that contains fully protonated guanosines, perdeuterated cytidines and uridines, and adenosines with deuterons on the ribose and aromatic C-2 carbons and a proton on the aromatic C-8 carbon, is denoted G,A8-[ΨCD]2. Differentially deuterated samples utilized in these studies included: A,G8-, G,A8-, U,G8,A8-, C,G8,A8-, G,U6-, and G,A2,R-[ΨCD]2.
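As a reading aid for the sample nomenclature just defined, the following Python sketch expands a label prefix such as G,A8 into the protonation state of each nucleotide type. It is a hypothetical helper, not part of the original work. It assumes comma-separated prefixes in the form listed above, treats a token that does not begin with A, C, G or U (for example the R in G,A2,R) as a further superscript of the preceding nucleotide, and reads the superscript 6, which appears in the sample list but is not spelled out in the definition, as protonation on C-6.

```python
NUCLEOTIDES = {"A", "C", "G", "U"}

# Superscript symbols; "8", "2" and "R" follow the definitions in the text.
# "6" is an assumption inferred from the G,U6- sample in the list above.
SITE_NAMES = {"8": "C-8", "2": "C-2", "6": "C-6", "R": "ribose"}


def parse_labeling(prefix):
    """Expand a prefix such as 'G,A2,R' into protonated sites per nucleotide.

    Returns a dict mapping nucleotide letter to a list of protonated sites.
    ['full'] means fully protonated; nucleotide types missing from the
    dict are perdeuterated.
    """
    scheme = {}
    current = None
    for token in prefix.split(","):
        token = token.strip()
        if not token:
            raise ValueError("empty token in %r" % prefix)
        if token[0] in NUCLEOTIDES:
            # A new nucleotide entry, optionally followed by superscripts.
            current = token[0]
            if current in scheme:
                raise ValueError("nucleotide %s listed twice" % current)
            scheme[current] = []
            superscripts = token[1:]
        else:
            # A bare superscript that belongs to the previous nucleotide.
            if current is None:
                raise ValueError("superscript before any nucleotide")
            superscripts = token
        for symbol in superscripts:
            if symbol not in SITE_NAMES:
                raise ValueError("unknown superscript %r" % symbol)
            scheme[current].append(SITE_NAMES[symbol])
    return {nt: (sites or ["full"]) for nt, sites in scheme.items()}
```

For example, `parse_labeling("G,A8")` describes the G,A8-[ΨCD]2 sample from the text: guanosines fully protonated, adenosines protonated only at C-8, and cytidines and uridines (absent from the result) perdeuterated.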
As expected, the 2D NOESY spectrum obtained for fully protonated [ΨCD]2 exhibited broad, overlapping cross peaks and was not assignable, Supplementary Figure S2. However, spectra exhibiting good resolution and sensitivity were obtained for several selectively deuterated [ΨCD]2 RNAs. Portions of the 2D NOESY spectrum obtained for G,A8-[ΨCD]2 are shown in Figure 4a. In this spectrum, intra-residue NOE cross peaks for all guanosines, G-H1′ to A-H8 NOEs for all sequential G(i)-A(i+1) pairs, and most of the expected H8-to-H8 NOEs involving sequential purines were readily detected. Because all of the adenosine ribose protons were substituted by deuterium, inter-residue A(i+1)-H8 to G(i)-H2′/3′ NOEs were also readily identified. Assignments of partially overlapping H2′ and H3′ signals were confirmed by comparison with spectra obtained for the isolated RNA stem loops. Note that some of the H8-to-H8 NOEs could not be unambiguously assigned due to small chemical shift differences between the neighboring aromatic protons and/or the proximity of the associated NOE cross peaks to the intense diagonal. The spectrum also exhibited exchange cross peaks corresponding to non-kissing SL-C and SL-D species. Although the intensities of the exchange peaks were significant, intensities of the diagonal and NOE cross peaks associated with the non-kissing species were small, when measurable, and no intra- or inter-residue NOEs were detected for the non-kissing species. Based on a qualitative assessment of the signal intensities, we estimate that, under the conditions of the NMR experiments, the population of the non-kissing species is ca. 5 ± 3% of that of the kissing species.
Sequential and long-range NOEs involving the adenosine-H2 protons were assigned primarily from spectra obtained for G,A2,R-[ΨCD]2, Figure 4b. Intense, well resolved cross peaks were observed for all of the expected sequential and cross-strand A-H2 to G/A-H1′ proton pairs, in addition to the expected A(i)-H1′ to G(i+1)-H8 NOEs, Figure 4b. Notably, residues A353 and A354 of the linker that connects SL-C and SL-D exhibited sequential inter-residue aromatic-to-ribose proton NOEs consistent with A-form helical stacking, with A353-H2 exhibiting intense NOEs with the H1′ and H8 protons of A354 and G310, and with A354-H2 exhibiting NOEs with the G309-H1′ and H8 and C355-H6 protons (observed in the A2R,G8C6-[ΨCD]2 spectra). These data indicate that SL-C and SL-D are arranged in an end-to-end manner, in which the 3′-residue of SL-C, the linker adenosines, and the 5′-residues of SL-D are stacked in an extended, A-like geometry.
NMR signals observed for pyrimidine H6 and H5 protons in spectra obtained for [ΨCD]2 samples containing fully protonated cytosine or uracil residues were significantly broader than those associated with the purine H8 protons. For example, as shown in Figure 4c, 1H NMR linewidths associated with U-H6 protons in 2D NOESY spectra obtained for U,A8,G8-[ΨCD]2 were typically more than three times greater than those associated with the G-H8 and A-H8 protons (the exception being U319, which forms an unstructured bulge; see below). Several purine H8-to-uracil proton NOEs could be assigned from these spectra, mainly because ΨCD contains only seven uracil residues. Analogous spectra obtained for ΨCD RNAs containing fully protonated cytosines were largely not assignable due to severe signal overlap (data not shown). To eliminate the relatively strong H5-to-H6 dipolar coupling that appears to be primarily responsible for the broader NMR signals, ΨCD samples were prepared using pyrimidine nucleotides containing a proton on the C6 carbon and deuterons on all other carbons. The quality of the NMR spectra obtained for these samples was significantly improved, enabling assignment of all expected purine-H1′ to pyrimidine-H6 proton connectivities and most purine-H8 to pyrimidine-H6 connectivities. Representative spectra and assignments made for G,U6,A8-[ΨCD]2 are shown in Figure 4d.
The NMR spectra obtained for the highly deuterated samples also exhibited cross-strand NOEs (i.e., between protons on two different strands of a given helix) for protons that are likely to be separated by more than 5.0 Å. For example, in the NMR spectrum obtained for A,G8-[ΨCD]2, relatively intense G359-H8 to A371-H2, G323-H8 to A343-H2, and G334-H8 to A328-H2 NOEs were readily detected (Supplementary Figure S3). Analogous cross peaks observed in spectra obtained for G,A2,R-[ΨCES]2 (Figure 4b) were originally attributed to spin diffusion, due to the close proximity of the A-H2 and G-H1′ protons. Since the G-H1′ protons are substituted by deuterium in the A,G8-labeled sample, cross-strand A-H2 to G-H8 NOEs observed in these spectra can only be attributed to direct, long-range dipolar interactions.
Using the above approach, 100% of the non-exchangeable aromatic and H1′ signals were assigned. Signal overlap precluded independent assignment of a majority of the H2′ and H3′ signals, but by comparisons with higher resolution 2D NOESY and 2D 1H-13C correlated NMR spectra obtained for isolated SL-C and SL-D hairpins, signals for > 90% of the purine H2′ and H3′ protons and > 70% of the pyrimidine H2′ and H3′ protons of [ΨCD]2 could be identified.
The improved resolution provided by the partially deuterated samples enabled us to correct a previously misassigned NOE associated with A340. Specifically, NMR spectra obtained for C,G8,A8-[ΨCD]2 revealed that an A340-H8 to C342-H1′ cross peak, which was previously attributed to spin diffusion via the intervening A341-H1′ and/or –H2 protons,44 was actually due to a direct dipole-dipole interaction. The A340-H8 proton also exhibited an intense intra-residue H8-to-H1′ NOE that was obscured by signal overlap in previously obtained NMR spectra, indicating that this residue adopts a syn conformation. As described below, these assignment corrections led to a change in the orientation of A340, but did not significantly affect other residues of the GGAA bulge in structures calculated for [ΨCD]2.
The 1H NMR chemical shifts measured for the stem loops of [ΨCD]2 were nearly indistinguishable from those observed for the isolated SL-C and SL-D hairpin RNAs, indicating that the structures should also be similar. The 1H NMR chemical shifts of the adenosine residues that link SL-C with SL-D differed by 0.02-0.83 ppm (Δδ) relative to the shifts observed previously for a 101-nucleotide RNA containing mutations in the loops of SL-B, SL-C and SL-D that prevented dimerization (Δδ for the H8, H2 and H1′ protons of 0.052, 0.271, 0.019 ppm for A353 and 0.325, 0.831 and 0.455 ppm for A354). This construct (SL-BmCmDm-UU) contained two additional non-native 3′-uridines that formed base pairs with A353 and A354 of the linker, and this is likely responsible for the observed chemical shift differences. Interestingly, the NOE cross peak patterns observed for A353 and A354 of both [ΨCD]2 and SL-BmCmDm-UU are consistent with A-form helical stacking. Thus, relatively intense NOEs were observed between A353-H2 and protons on the following (A354-H1′, -H2, and –H8) and cross-strand (G310-H1′) residues, and between A354-H2 and the H1′ proton on the following residue (C355). In addition, residues C352 through C355 exhibited sequential ribose(i) to H8/6(i+1) and H8/6(i) to H5(i+1) NOEs consistent with a structure in which residues C352 through C355 stack in a continuous, A-form like manner.
A total of 1248 unique and functionally non-redundant NOE-derived 1H-1H distance restraints were obtained from the 2H-edited NOESY spectra, Table 1. Hydrogen bond restraints and torsion angle restraints were employed as flat-well potentials with values centered at those observed in idealized A-form helices for segments exhibiting NOE cross peak patterns and intensities consistent with A-form helical conformations (residues G310-G313, C316-G318, G320-G323, G324-A328, U333-C337, C342-C352 of SL-C and C355-A362, U367-G374 of SL-D). Major groove inter-phosphate distances were loosely restrained for these residues using database potentials derived from high-resolution X-ray crystal structures.53 These restraints were required to avoid collapse of the major groove that can result from the asymmetric distribution of distance restraints in A-form helical segments.44,53,55,56 Inter-molecular cross-kissing H-bond restraints consistent with the NOE results obtained for the segmentally-labeled sample were employed (see above), but no intra-molecular H-bond or torsion angle restraints were employed for the residues of any of the loops or bulges. Because the NOEs associated with A353 and A354 are consistent with A-form helical stacking, loose inter-phosphate distance and torsion angle restraints (ideal ± 50°) were employed for these residues as well. In addition, weak restraints were employed for the phosphorus, ribose and aromatic carbon atoms to enforce symmetry between the two molecules of the dimer.
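To make the idea of a flat-well restraint concrete, the following Python sketch shows one common form: zero penalty anywhere inside the allowed window and a penalty that grows outside it. The quadratic growth, the force constant, and the function names are assumptions chosen for illustration; the text does not specify the functional form used in the structure calculations, and this is not a reproduction of the Cyana target function.

```python
def flat_well_penalty(value, lower, upper, force_constant=1.0):
    """Penalty that is zero inside [lower, upper] and quadratic outside.

    The quadratic shape and the force constant are illustrative
    assumptions, not values taken from the structure calculation.
    """
    if lower > upper:
        raise ValueError("lower bound exceeds upper bound")
    if value < lower:
        excess = lower - value
    elif value > upper:
        excess = value - upper
    else:
        excess = 0.0
    return force_constant * excess * excess


def torsion_penalty(angle, ideal, half_width=50.0, force_constant=1.0):
    """Flat-well penalty for a torsion angle restrained to ideal +/- half_width.

    Angles are in degrees. The deviation is wrapped into [-180, 180) so
    that, for example, -170 and +170 are treated as 20 degrees apart.
    """
    deviation = (angle - ideal + 180.0) % 360.0 - 180.0
    return flat_well_penalty(deviation, -half_width, half_width, force_constant)
```

The default half-width of 50° mirrors the loose "ideal ± 50°" torsion restraints applied to A353 and A354; a narrower window would restrain a residue more tightly toward the idealized geometry.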
An ensemble of 20 structures with lowest target function (4.11 ± 0.05 Å²) was obtained from an initial pool of 160 structures generated using Cyana.57 The structure is well defined by the NMR data, with best-fit superpositions of all heavy atoms affording pairwise RMS deviations (relative to mean atomic coordinates) of 0.40 ± 0.12 Å, Table 1 and Figure 5a,b. Residues G309 and U319 (and the corresponding residues in the symmetrical dimer) were not experimentally restrained and therefore exhibit poorer convergence. These residues give rise to intense and relatively narrow 1H NMR signals and are therefore likely to be disordered. Statistical information regarding the restraints employed, restraint violations, and the structure convergence is provided in Table 1.
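For readers unfamiliar with how a convergence statistic of this kind is obtained, the sketch below computes the RMS deviation of each model from the mean atomic coordinates of an ensemble. It is a simplified illustration under a stated assumption: the models are taken to be already superimposed, so the best-fit superposition step mentioned above is omitted. The function names and the input layout (a list of models, each a list of (x, y, z) tuples in the same atom order) are hypothetical.

```python
import math


def mean_coordinates(ensemble):
    """Average each atom's position over all models in the ensemble."""
    n_models = len(ensemble)
    n_atoms = len(ensemble[0])
    if any(len(model) != n_atoms for model in ensemble):
        raise ValueError("all models must contain the same atoms")
    mean = []
    for i in range(n_atoms):
        mean.append(tuple(
            sum(model[i][axis] for model in ensemble) / n_models
            for axis in range(3)
        ))
    return mean


def rmsd_to_mean(ensemble):
    """Return one RMSD value per model, measured against the mean coordinates.

    Assumes the models were superimposed beforehand; no fitting is done here.
    """
    mean = mean_coordinates(ensemble)
    values = []
    for model in ensemble:
        squared = sum(
            (a - b) ** 2
            for atom, ref in zip(model, mean)
            for a, b in zip(atom, ref)
        )
        values.append(math.sqrt(squared / len(mean)))
    return values
```

The mean and standard deviation of the returned list would then give a summary of the "value ± spread" form quoted in the text.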
Many features of the [ΨCD]2 structure are consistent with those observed in NMR structures calculated previously for native and mutant fragments of ΨCES.39,43,44 Thus, residues G310-G323 adopt an A-form helical “lower stem” in which unpaired residues A314 and C315 are internally stacked, U319 forms an unstructured extra-helical bulge, and residues G338-A341 form an A-minor K-turn type structure that connects the lower stem with the upper stem (G324-C337). As observed previously for a mutant ΨCES RNA (in which residues of the GACG tetraloop were mutated to prevent dimerization44), the nucleobase of G338 stacks against C337 in an A-like manner, G339 adopts a syn conformation and packs against the nucleobase of A340, and A341 packs against the nucleobase of C342 and forms an A-minor like interaction with A324 and A325 of the upper stem, Figure 5c. In addition, we now have evidence that A340 adopts a syn conformation, and as indicated above, NOE spectra obtained for the C,G8,A8-[ΨCD]2 sample provide clear evidence for direct A340-H8 to C342-H1′/–H4′ dipolar interactions (which were originally erroneously attributed to spin diffusion via A341). As such, the orientation of A340 in the [ΨCD]2 structure differs from that reported earlier, and is now shown to pack against the nucleobase of G339 and the nucleobase and ribose moieties of A341 and C342, respectively (Fig. 5c). The structures and interactions of the two symmetrical kissing interfaces (SL-C to SL-D’ and SL-C’ to SL-D) are essentially indistinguishable from those observed previously for a dimeric [SL-D]2 kissing hairpin.39
A surprising feature of the [ΨCD]2 structure is that the lower stems of SL-C and SL-D are stacked in an “end-to-end” orientation, in which the linker residues A353 and A354 are stacked in an A-helical manner between the G310-C352 base pair of SL-C and the C355-G374 base pair of SL-D, Figure 5d. Although similar end-to-end stacking interactions were observed in the monomeric, mutant ΨCES RNA, those interactions were attributed to non-native base pairing between the linker adenosines and two 3′-uridines that were included to accommodate an HpaI restriction site used for sub-cloning.44 The [ΨCD]2 RNA used in the present studies was transcribed from a template that encoded a 3′-hammerhead ribozyme which, after cleavage, afforded a product RNA that terminated at G374 and did not contain additional native or non-native residues. Thus, in the [ΨCD]2 structure, residues C352-C355 are stacked in a continuous, A-form like helical conformation as observed previously for ΨCES, even though A353 and A354 do not form Watson-Crick base pairs. As such, the structure of [ΨCD]2 is both elongated and flat, with overall dimensions of ca. 95 Å by 45 Å by 25 Å.
The 5′-guanosine residues do not form base pair or stacking interactions and appear disordered in the structure. This is consistent with earlier NMR studies showing that this guanosine (G309) interacts directly with the zinc finger domain of the MoMuLV NC protein. The two guanosines of the symmetrical dimer (G309 and G309′) are separated by ~20 Å in the NMR structure, Figure 5, a finding that has important implications regarding the likely structure of the preceding DIS-2 duplex (see below).
Because of the inherent paucity of long-range structural information available in the NOE data, attempts were made to establish relative helix orientations and overall molecular shape using NMR-derived residual dipolar couplings (RDCs) and small angle X-ray scattering (SAXS), respectively. NMR-derived RDCs and RCSAs have been used previously to establish global structural features,44,53,58-85 including inter-helical orientations.44,63,64,68,86-91 Unfortunately, neither of these approaches was successful. As indicated above, ΨCD exists as a monomer at low ionic strength, and under the conditions of the NMR studies, a small amount of the monomer species (or extended dimer species, in which one of the hairpins does not participate in kissing interactions) persists and is detectable in the NMR spectra. Despite the low abundance of these species, their higher apparent rotational mobilities lead to significant contributions to the RDC spectra obtained, precluding quantitative analysis of the RDC data. In addition, dynamic light scattering (DLS) and SAXS data obtained for ΨCD under conditions of the NMR experiments were consistent with the presence of a small amount of one or more higher-MW species. Unfortunately, sample heterogeneity, and particularly the presence of even small amounts of high MW contaminants, can complicate quantitative analysis of the SAXS data. We therefore carried out structural studies by cryo-electron tomography (cryo-ET).
Cryo-ET allows for the direct detection of molecular assemblies within a given sample, and for heterogeneous samples, species of similar size and/or topological properties can be classified for further analysis. A representative projection of 15 slices of unfiltered cryo-ET data obtained for [ΨCD]2 is shown in Figure 6a. Numerous punctate and elongated dark densities are clearly visible in the tomogram. In contrast, cryo-ET data obtained for a sample containing only buffer (the same buffer used to prepare the [ΨCD]2 sample) and processed under identical conditions lacked these features, Figure 6b, indicating that the distinct densities observed using the [ΨCD]2 samples are due to electron scattering by the RNA. As an additional control, cryo-ET data were also obtained for [ΨCES]2. The tomograms obtained for this larger RNA exhibited densities with volumes and topological features consistent with the side-by-side packing of the DIS-2 helix and the [ΨCD]2 tandem hairpin (Supplementary Figure S9). These findings collectively indicate that RNAs as small as [ΨCD]2 (132 residues) are amenable to structural studies by cryo-ET. Previously, the smallest reported molecule studied by cryo-EM was a 78 kDa, 252-residue DNA tetrahedron. The structure of this highly symmetric and rigid molecule was determined by single particle methods with symmetry enforced.92
In addition to the compact densities discussed above, densities consistent with larger, elongated structures were observed in the [ΨCD]2 tomograms (red boxes in Figure 6c). These more extended densities adopted random overall shapes. In view of the facts that (i) the NMR spectra obtained for the liquid state samples exhibited exchange cross peaks indicative of a minor population of monomeric, non-kissing species, and (ii) the DLS and SAXS data suggest the presence of a minor population of a higher molecular weight species, we attribute the smaller and extended densities in the cryo-ET tomograms to monomers, partly dissociated dimers (i.e., dimers linked by only a single kissing structure) and higher-order multimers. The [ΨCD]2 concentration used for cryo-ET was ~ 40-fold lower than that used for NMR measurements, which may explain why the non-kissing species were more abundant in the cryo-ET data than in the NMR data.
Stereo views of an unfiltered subvolume of a representative cryo-ET tomogram showing a typical compact density, along with the relative noise level, are shown in Figure 6d (movies of the raw data are provided in Supplementary Figures S5 and S6). For comparison, a representative [ΨCD]2 NMR structure has been fitted into the density. The signal-to-noise ratio of the unfiltered cryo-ET data is clearly sufficient for alignment and averaging, and the overall dimensions of the compact densities are consistent with the dimensions of the [ΨCD]2 NMR structure, Figure 6d.
A total of 38 such subvolumes with compact densities were computationally extracted, classified, aligned and averaged (see Methods), affording an averaged cryo-ET density with approximate length of 95 Å, a width of 45 Å and a thickness of 25 Å at its narrowest point, Figure 6e. The averaged cryo-ET density map exhibits two-fold symmetry, even though no symmetry was assumed or applied in the reconstruction, alignment and averaging process, and appears to be fully consistent with the NMR ensemble, Figure 6e. Although U319 resides outside of the averaged cryo-ET density (Fig. 6e), this residue is disordered in the NMR structure and appears to undergo rapid conformational averaging (based on the unusually narrow 1H NMR lineshapes and lack of significant inter-residue NOEs; see above). Interestingly, the cryo-ET average density exhibits a slightly concave shape about the approximate two-fold axis, which also appears consistent with the NMR structures, Fig. 6e.
The application of 2D NOESY NMR spectroscopy to partially deuterated [ΨCD]2 samples served as an effective approach for assignment of the non-exchangeable aromatic and H1′ protons. Assignment of the H2′ and H3′ signals was generally more challenging due to signal overlap, and in many cases it was only possible to make assignments based on comparisons with higher resolution 2D NOESY, TOCSY, and 1H-13C correlated NMR spectra obtained for isolated SL-C and SL-D hairpins. Except for the two adenosines that link SL-C and SL-D, 1H NMR frequencies and NOE cross peak patterns and intensities observed for residues of [ΨCD]2 were nearly indistinguishable from those observed for the isolated RNA hairpins, providing confidence in the resonance assignments and indicating that the internal structures of the hairpins in [ΨCD]2 are similar to those observed for the monomeric and kissing dimer forms of the isolated hairpins.
The NMR data obtained for one of the partially deuterated [ΨCD]2 samples (C,A8,G8-[ΨCD]2) led to the identification of inter-residue dipolar interactions that had previously been attributed to spin diffusion. As such, the orientation of A340 of the GGAA bulge of the [ΨCD]2 structure reported here differs from its orientation in previously reported structures from our laboratory.41,44 Although this correction did not lead to significant changes in the structure of the SL-C hairpin or the GGAA bulge (all of the A-minor93 and K-turn94-like interactions observed in the previous studies44 were also observed here), it does illustrate the power of 2H-editing for differentiating between spin diffusion and direct dipolar interactions. The structures calculated for [ΨCD]2 exhibited remarkably good convergence, despite the paucity of long-range NOE-derived restraints (the only intermolecular NOEs detected were for protons at the kissing interfaces). This can be attributed to the fact that the internal structures of the hairpins, bulges, linkers, and kissing interfaces were all well defined by the local structural restraints, and as such, there was only one general intermolecular arrangement that satisfied both the internal NOE data and the intermolecular cross-kissing interactions.
Attempts to confirm the overall topological features of the [ΨCD]2 NMR structure using NMR-derived residual dipolar coupling (RDC) and residual chemical shift anisotropy (RCSA) information were unsuccessful because, under all experimental conditions employed, the [ΨCD]2 samples contained a small residual population of monomeric and/or extended multimeric RNAs with non-kissing hairpins. The favorable NMR relaxation properties of these species (due to more rapid rotational diffusion), combined with the significantly less favorable relaxation properties of the fully folded dimeric RNA, precluded quantitative measurement of RDC and RCSA values for the dimer. Attempts to obtain information on the overall shape of [ΨCD]2 by SAXS, which has been used by others for RNA structure refinement,81,95,96 were also unsuccessful due to conformational heterogeneity and the presence of multimers. Fortunately, we were able to obtain global shape information for [ΨCD]2 by cryo-ET. Although the resolution of the cryo-ET data did not allow detection of individual base pairs or the precise separation of the helices, the overall shape and dimensions of the individual and averaged cryo-ET subvolume densities, as well as the symmetry of the density average, were consistent with the NMR-derived structure of [ΨCD]2. The cryo-ET data also confirmed the presence of multimers, as well as extended species (monomers and dimers) inferred from the chemical exchange cross peaks in the 2D NOESY spectra.
The MoMuLV [ΨCD]2 NMR structure differs considerably from the MuSV [ΨCD]2 structure proposed on the basis of SHAPE and site-directed hydroxyl radical cleavage analyses.46 In the NMR structure, the 5′-guanosine of a given RNA strand resides near the kissing interface formed by the SL-C loop of the same molecule, whereas in the SHAPE structure, the 5′-guanosine resides near the SL-C’ loop of the neighboring strand. Also, the stems of SL-C and SL-D of a given strand are packed in a nearly side-by-side arrangement in the SHAPE structure, but are arranged end-to-end in the NMR structure. As a consequence, the NMR structure has an elongated overall shape, consistent with the cryo-ET data, whereas the SHAPE model is more compact and has a more isotropic, globular shape. These disparities could conceivably be due to the presence of alternate RNA conformers in solution (detected by both NMR and cryo-ET), which could potentially complicate interpretation of bulk chemical reactivity results. However, we cannot rule out the possibility that differences in nucleotide sequence (the MoMuSV and MoMuLV tandem hairpins differ by a G338U substitution) or sample conditions may be responsible for the observed differences.
Cryo-ET with subvolume classification, alignment and averaging is emerging as a useful technique for studying heterogeneous samples at low resolution. Compared with spectroscopic methods that rely on bulk sample measurements for determining molecular shape, cryo-ET appears superior for RNAs like ΨCD that adopt multiple equilibrium conformations. The low resolution global structural information provided by cryo-ET is complementary to the high resolution local structural information obtainable by NMR, making this an attractive combination for structural studies of modest sized (~150 nucleotide) RNAs.
Mutagenesis and packaging studies have provided compelling evidence that ΨCES is both critical for genome packaging40,42,97-99 and capable of functioning as an autonomous packaging signal.42 The DIS-2 hairpin of ΨCES appears to contribute to packaging by regulating exposure of one or two UCUG elements located between DIS-2 and SL-C.41,100 These elements are sequestered by base pairing within the lower stem of the monomeric DIS-2, and a register shift in base pairing that occurs upon dimerization exposes one or both of the UCUG elements, thereby enabling high affinity binding to NC.41 An upstream element that also promotes dimerization and packaging (DIS-1) was shown to behave similarly.41 In addition, a large fragment of the 5′-UTR that spans from the primer binding site (PBS) through the gag start codon and includes all residues necessary for efficient RNA packaging also exhibits dimerization-dependent NC binding, with the monomeric RNA binding 1 or 2 NC proteins with high affinity and the dimer binding tightly to 12 NC molecules.100 Mutations in this large RNA that inhibited dimerization in vitro led to dramatic reductions in RNA packaging in vivo.100 These findings collectively suggest that diploid genome selection is regulated by an RNA structural switch mechanism in which ~12 high affinity NC binding sites within the MoMuLV 5′-UTR become exposed upon dimerization. A resulting Gag12:RNA2 complex could then serve to nucleate virus assembly, Figure 7. Such a mechanism is consistent with recent findings for HIV-1 assembly, which appears to be initiated by a complex comprising two RNA genomes and fewer than ~12 Gag proteins.101
The double hairpin is immediately preceded by two UCUG sequences (U301CUG304 and U306CUG309), and is immediately followed by a closely related U377UCG380 sequence, all of which are potentially capable of binding NC with high affinity if maintained in an exposed, unstructured-like state.102 We propose that the tandem hairpin functions not only to promote dimerization, but also as a scaffold that helps maintain exposure of a clustered group of NC binding sites. Although U301-G304 were shown to participate in non-canonical base pairs that extended the length of the DIS-2 helix in isolated ΨCES fragments,43,44 modeling studies indicate that residues U301 through G309 would need to exist in a relatively extended conformation in order to bridge between the well-separated ends of the DIS-2 duplex and the relatively proximal 5′-ends of the [ΨCES]2 dimer. Thus, in the context of the native [ΨCES]2 dimer, residues U301CUG304 and U306CUG309 should both exist in extended, unstructured conformations, affording a total of six high affinity NC binding sites per dimer (including the 3′-proximal U377UCG380 element). The stage is now set for structural and NC (and Gag) binding studies of a MoMuLV ΨCES construct that contains all six predicted NC binding sites (underway).
pMoMuLV-SL-BCD-HDV was generated by polymerase chain reaction (PCR) on pNCA, which contains the proviral DNA of MoMuLV. DNA fragments were amplified with oligonucleotide MoMuLV-276f (5′-CCAGTGAATTCTAATACGACTCACTATAGGCGGTACTAGTTAGCTAACTAGCTCTGTATCTGGCGG-3′), carrying an EcoRI site and a T7 RNA polymerase promoter, and oligonucleotide MoMuLV-374r (5′-CCAGTGCTAGCCCCTGGGACGTCTCCCAGGGTTGCGGCCGGGTGTT-3′), carrying an NheI site. The amplified product was inserted in the pHDV4 plasmid (kind gift from Dr. Conn) after EcoRI and NheI digestions (New England Biolabs).
pMoMuLV-SL-BCcwapD-HDV was generated by PCR-based mutagenesis. Oligonucleotides MoMuLV-Cswapf (5′-GCGGACCCGTGGTGGAACAGACGTGTTCGGAACACCCGGCCGC-3′) and MoMuLV-Cswapr (5′-GCGGCCGGGTGTTCCGAACACGTCTGTTCCACCACGGGTCCGC-3′) were used to introduce the mutations (nt 328 T to A and nt 333 A to T). Two DNA fragments were amplified on pMoMuLV-SL-BCD-HDV: one with oligonucleotide M13f (5′-CGCCAGGGTTTTCCCAGTCACGAC-3′), which anneals to pUC19 such that the 5′ end of the primer lies 47 nucleotides upstream of the EcoRI restriction site, and oligonucleotide MoMuLV-Cswapr; and the other with oligonucleotide MoMuLV-Cswapf and oligonucleotide M13r (5′-AGCGGATAACAATTTCACACAGGA-3′), which anneals to the pUC19 vector such that the 5′ end of the primer lies 48 nucleotides downstream of the HindIII restriction site. A further PCR was performed on the two PCR products with oligonucleotides M13f and M13r. The amplified product was inserted in the pUC19 plasmid after EcoRI and HindIII digestions (New England Biolabs).
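In overlap-extension mutagenesis of this kind, the two mutagenic primers are normally exact reverse complements of each other, so that the two first-round fragments can anneal across the mutated region in the final PCR. The following minimal sketch checks that property for the two primer sequences quoted above; the function names are illustrative and are not part of any published protocol.

```python
# Mutagenic primer sequences copied from the text (written 5'->3').
CSWAP_F = "GCGGACCCGTGGTGGAACAGACGTGTTCGGAACACCCGGCCGC"
CSWAP_R = "GCGGCCGGGTGTTCCGAACACGTCTGTTCCACCACGGGTCCGC"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def are_overlap_partners(forward: str, reverse: str) -> bool:
    """True if the reverse primer is the exact reverse complement of the forward primer."""
    return reverse_complement(forward) == reverse


if __name__ == "__main__":
    print(are_overlap_partners(CSWAP_F, CSWAP_R))
```

A check of this sort is a cheap way to catch transcription errors in primer sequences before ordering or reporting them.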
pMoMuLV-SL-CD-HDV was generated by PCR on pMoMuLV-SL-BCD-HDV, which contains the proviral DNA of MoMuLV. DNA fragments were amplified with oligonucleotide MoMuLV-309f (5′-CCAGTGAATTCTAATACGACTCACTATAGGCGGACCCGTGGTGGAACTGACGAGTTCGGAACAC-3′), carrying an EcoRI site and a T7 RNA polymerase promoter, and oligonucleotide M13r. The amplified product was inserted in the pUC19 plasmid after EcoRI and HindIII digestions (New England Biolabs).
pMoMuLV-SL-CcwapD-HDV was generated by PCR. Two DNA fragments were amplified on pMoMuLV-SL-CD-HDV with oligonucleotide MoMuLV-309mf (5′-CCAGTGAATTCTAATACGACTCACTATAGGCGGACCCGTGGTGGAACAGACGTGTTCGGAACAC-3′), carrying an EcoRI site, a T7 RNA polymerase promoter and two mutations in the SL-C sequence, and oligonucleotide M13r. The amplified product was inserted in the pUC19 plasmid after EcoRI and HindIII digestions (New England Biolabs).
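The two SL-C substitutions can be located by comparing the wild-type forward primer (MoMuLV-309f) with its mutant counterpart (MoMuLV-309mf). The sketch below does this with a simple position-by-position comparison. The mapping from primer position to genome numbering uses a hypothetical offset (`ASSUMED_OFFSET`), chosen here only so that the first substitution lands on nt 328 as named in the U328A/A333U mutant; it is an assumption for illustration, not a value stated in the text.

```python
# Forward primer sequences copied from the text (written 5'->3').
WT_309F = "CCAGTGAATTCTAATACGACTCACTATAGGCGGACCCGTGGTGGAACTGACGAGTTCGGAACAC"
MUT_309MF = "CCAGTGAATTCTAATACGACTCACTATAGGCGGACCCGTGGTGGAACAGACGTGTTCGGAACAC"

# Assumption: genome position = primer position + offset (see lead-in).
ASSUMED_OFFSET = 280


def substitutions(reference: str, variant: str):
    """List (1-based position, reference base, variant base) for each mismatch."""
    if len(reference) != len(variant):
        raise ValueError("sequences must be the same length for a direct comparison")
    return [
        (index + 1, ref_base, var_base)
        for index, (ref_base, var_base) in enumerate(zip(reference, variant))
        if ref_base != var_base
    ]


if __name__ == "__main__":
    for position, ref_base, var_base in substitutions(WT_309F, MUT_309MF):
        genome_nt = position + ASSUMED_OFFSET
        print(f"primer pos {position} (assumed nt {genome_nt}): {ref_base} -> {var_base}")
```

The comparison reports a T-to-A change followed, five positions downstream, by an A-to-T change on the sense strand, which matches the spacing and base identities of the U328A/A333U designation used for the packaging constructs.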
PCR was performed on pMoMuLV-SL-CcswapD-HDV with oligonucleotide HH-354in-f (5′-GGTCTGATGAGAGCGAAAGCTCGAAACAGCTGTGAAGCTGTCACCCTGGGAGACGTCCCAGGG-3′), carrying a part of the hammerhead (HH) sequence, and oligonucleotide M13r. The resulting PCR product was used in a second PCR reaction performed with oligonucleotide HH-354out-f (5′-CCAGTGAATTCTAATACGACTCACTATAGGGAATCCAGGGTCTGATGAGAGCGAAAGCTCGAAACAGC-3′), carrying an EcoRI site, a T7 RNA polymerase promoter, and a part of the HH sequence, and M13r. The amplified product was inserted in the pUC19 plasmid after EcoRI and HindIII digestions (New England Biolabs).
The partially deuterated rUTP reagents used for in vitro transcription were obtained from Cambridge Isotope Laboratories (CIL, Massachusetts) and ProSpect Pharma (Columbia, MD), and perdeuterated rNTPs were from CIL and Silantes (GmbH, Germany). Protonation at the C8 position of perdeuterated rGTP and rATP was achieved by incubation with ~5 equivalents of triethylamine (TEA) in 1H2O at 60 °C for 24 hours and for 5 days, respectively.103,104 Deuteration of the C8 carbon of fully protonated GTP and ATP was achieved by incubation in 2H2O (99.8 % deuteration; CIL). The TEA was subsequently removed by lyophilization.
All plasmids were amplified using E. coli DH5α. The pMoMuLV-SL-CD-HDV, pMoMuLV-SL-CcwapD-HDV and pMoMuLV-HH-SL-D-HDV plasmids were linearized by EcoRV (New England Biolabs) at 1 μg/unit for 16 hours at 37 °C. The linearized DNA was then extracted twice with phenol-chloroform and precipitated with ethanol. The pellet was washed with 70 % (v/v) ethanol, and the DNA was dissolved in sterile distilled water. A synthetic oligonucleotide (5′-TGCGGCCGGGTGTTCCGAACTCGTCAGTTCCACCACGGGTCCGCCAGATACAGAGCTAGTTAGCTAACTAGTACCGCCTATAGTGAGTCGTATTA-3′) was used as the template for SL-BCm, wherein the first and second nucleotides contained 2′-O-methyl groups to prevent the addition of non-templated nucleotides. This template was mixed with DNA containing the positive-sense strand of the T7 promoter (5′-TAATACGACTCACTATA-3′), heated and then cooled to anneal the two strands so as to form the double-stranded promoter.
RNAs were synthesized by in vitro transcription105 using T7 RNA polymerase in 30 ml reactions, each containing 2.5 mg of template, 20 mM MgCl2, 2 mM spermidine, 80 mM Tris-HCl (pH 8.1), 4 mM of each NTP, 2 mM DTT, and 0.3 mg T7 RNA polymerase. To promote ribozyme cleavage, a further 10 mM MgCl2 was added to the transcription mixtures, which were then incubated for 16 hours and quenched with 35 mM EDTA. Selectively protonated RNA samples were prepared using appropriate combinations of deuterated and fully or partially protonated NTPs. After denaturation at 96 °C for five minutes, the RNA was purified by electrophoresis on urea-containing polyacrylamide denaturing gels. The concentration of each sample was determined by measuring the optical absorbance at 260 nm.
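The reaction composition above can be turned into absolute amounts with simple arithmetic. The sketch below is a bookkeeping aid only: it assumes that the added MgCl2 and EDTA do not change the reaction volume appreciably and ignores Mg2+ complexation by NTPs, neither of which is stated in the text.

```python
REACTION_VOLUME_ML = 30.0   # per transcription reaction (from the text)
NTP_MM = 4.0                # concentration of each NTP
TEMPLATE_MG = 2.5
POLYMERASE_MG = 0.3
MG_INITIAL_MM = 20.0        # MgCl2 during transcription
MG_ADDED_MM = 10.0          # extra MgCl2 for ribozyme cleavage
EDTA_MM = 35.0              # quench


def micromoles(conc_mm: float, volume_ml: float) -> float:
    """mM x mL = micromoles."""
    return conc_mm * volume_ml


def micrograms_per_ml(mass_mg: float, volume_ml: float) -> float:
    """Convert a mass in mg and a volume in mL to a concentration in ug/mL."""
    return mass_mg * 1000.0 / volume_ml


ntp_each_umol = micromoles(NTP_MM, REACTION_VOLUME_ML)
template_ug_ml = micrograms_per_ml(TEMPLATE_MG, REACTION_VOLUME_ML)
polymerase_ug_ml = micrograms_per_ml(POLYMERASE_MG, REACTION_VOLUME_ML)
# Nominal totals, assuming no volume change on addition (assumption).
mg_total_mm = MG_INITIAL_MM + MG_ADDED_MM
edta_excess_mm = EDTA_MM - mg_total_mm

if __name__ == "__main__":
    print(f"each NTP: {ntp_each_umol:.0f} umol per reaction")
    print(f"template: {template_ug_ml:.1f} ug/mL, polymerase: {polymerase_ug_ml:.1f} ug/mL")
    print(f"nominal Mg2+: {mg_total_mm:.0f} mM, EDTA excess: {edta_excess_mm:.0f} mM")
```

Under these assumptions the EDTA quench nominally exceeds the total Mg2+ concentration, which is consistent with its use to stop further Mg2+-dependent cleavage.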
The donor SL-D RNA synthesized using the pHH-SLD-HDV plasmid (a kind gift from Graeme L. Conn) 52 contains a 2′,3′-cyclic phosphate at the 3′ end and a hydroxyl group at the 5′ end after ribozyme cleavage. The presence of a 2′,3′-cyclic phosphate is advantageous since it prevents the formation of undesired self-ligation products. However, ligation of two RNA fragments requires the donor RNA to be 5′ mono-phosphorylated. In order to preserve the 3′ end, this was achieved by treating SL-D with a polynucleotide kinase that is defective for 3′ phosphatase activity (New England Biolabs). The acceptor SL-BC RNA was synthesized using an oligonucleotide template and hence contained the desired 5′ triphosphate and 3′ hydroxyl termini and could be used without further treatment. The 5′ triphosphate terminus is not capable of self-ligation, and the 3′ hydroxyl terminus acts as an acceptor during the ligation reaction. The two fragments (SL-BCm : SL-D = 1.5 : 1.0) were mixed in 50 mM Tris-HCl (pH 7.8), 10 mM MgCl2, 1 mM ATP, 10 mM dithiothreitol, 15 % PEG8000 (w/v) and RNA ligase (New England Biolabs).
MoMuLV RNA packaging was assessed using derivatives of MoMuLV-gag-pol-puro: an intact proviral construct in which the env open reading frame has been replaced by the puromycin N-acetyltransferase gene (puroR) driven by a simian virus 40 (SV40) promoter 106. The “classic Ψ” deletion (Δ215-568) and the SL-C mutant (U328A/A333U) were generated by overlap PCR, sequenced, and used to replace the corresponding portions of the parental MoMuLV-gag-pol-puro vector. The helper plasmid pNGVL-3′-gag-pol, which expresses MoMuLV Gag and Gag-Pol from a 5′ leader-deleted transcript driven by the CMV promoter, has been described previously 107.
293T cells (human embryonic kidney cells expressing SV40 T antigen) were grown in Dulbecco’s modified Eagle’s medium supplemented with 10% fetal calf serum and 1% penicillin-streptomycin. Virus was produced by transient transfection of the MoMuLV-gag-pol-puro or pNGVL-3′-gag-pol plasmids into 293T cells by calcium phosphate precipitation 108. Virus-containing media were harvested and pooled at 24, 36, and 48 hours post transfection, filtered through a 0.2-μm filter, and stored at −70 °C prior to use.
RNA was isolated from pelleted virus using a previously described proteinase K-based extraction protocol 22 and quantified by RNase protection using a chimeric MoMuLV-7SL riboprobe complementary to both a portion of the MoMuLV 5′ untranslated region (nts. 55-255) and part of 7SL (Figure 2d) 49. Previously described RNase protection assay approaches 49 were modified by extending hybridization times to 16 h and digesting using only RNase T1. The riboprobe protected 201 nt of the wild type and the SL-C mutant (U328A/A333U) RNAs, 159 nt of the “classic Ψ” deletion (Δ215-568) RNA, and 100 nt of 7SL RNA. Bands were quantified by PhosphorImager analysis and adjusted for the number of radiolabelled Cs incorporated. RNA packaging was quantified by normalizing the amount of MoMuLV RNA in each lane to the amount of co-packaged 7SL.
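The normalization described above can be written out explicitly. The following is a minimal sketch of the arithmetic under the assumption that each band intensity is divided by the number of labelled cytidines in the protected fragment before the MoMuLV:7SL ratio is taken; all function names and the numerical inputs in the usage example are hypothetical placeholders, not measured values.

```python
def c_adjusted(band_intensity: float, labelled_c_count: int) -> float:
    """Scale a band intensity by the number of radiolabelled Cs in the protected fragment."""
    if labelled_c_count <= 0:
        raise ValueError("labelled_c_count must be positive")
    return band_intensity / labelled_c_count


def packaging_ratio(momulv_band: float, momulv_c: int,
                    sl7_band: float, sl7_c: int) -> float:
    """MoMuLV RNA per lane, normalized to co-packaged 7SL RNA in the same lane."""
    return c_adjusted(momulv_band, momulv_c) / c_adjusted(sl7_band, sl7_c)


def relative_packaging(mutant_ratio: float, wild_type_ratio: float) -> float:
    """Express a mutant's packaging ratio as a fraction of the wild-type ratio."""
    return mutant_ratio / wild_type_ratio


if __name__ == "__main__":
    # Placeholder intensities and C counts, for illustration only.
    wt = packaging_ratio(momulv_band=1000.0, momulv_c=50, sl7_band=400.0, sl7_c=20)
    mut = packaging_ratio(momulv_band=250.0, momulv_c=50, sl7_band=400.0, sl7_c=20)
    print(f"mutant packaging relative to wild type: {relative_packaging(mut, wt):.2f}")
```

Normalizing to 7SL within each lane corrects for lane-to-lane differences in virion recovery, so the final ratio reflects packaging efficiency rather than sample loading.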
Samples for NMR studies (250 μL of 800 μM RNA in D2O (99.8 %; Cambridge Isotope Laboratories, Massachusetts) in a Shigemi sample tube (Japan)) were prepared in TRIS-d11 buffer (pD = 7.0) containing NaCl (80 mM). NMR data were collected with a Bruker AVANCE spectrometer (800 MHz, 1H; sample temperature = 37 °C), and were processed with NMRPipe/NMRDraw109 and analyzed with NMRView110. Non-exchangeable 1H assignments were obtained from 2D NOESY111,112 (τm=100 ms) data collected at 37 °C, with real and imaginary components of the indirect dimension collected in an interleaved manner to minimize t1 artifacts.
Structures were calculated and refined with CYANA113 using the AMBER114 residue library. Upper-limit distance restraints of 2.7 Å, 3.3 Å and 5.0 Å were employed for direct NOE cross peaks of strong, medium and weak intensities, respectively, for all cross peaks except those associated with the intraresidue H8/6-to-H2′ and –H3′ interactions. For these proton pairs, upper distance limits of 4.2 Å and 3.2 Å were employed for NOEs of medium and strong intensity, respectively.44 Cross-helix P-P distance restraints (with 20% weighting coefficient) were employed for A-form helical segments to prevent the generation of structures with collapsed major grooves:44,53,56 P(i)-P(i+2) (cross-helix phosphorus of the i+2 base pair) = 16.1 - 17.1 Å; P(i)-P(i+3) = 14.2 - 15.2 Å; P(i)-P(i+4) = 11.7 - 12.7 Å; P(i)-P(i+5) = 9.4 - 10.4 Å; P(i)-P(i+6) = 9.0 - 10.0 Å. Torsion angle restraints for A-helical stem residues were centered around published A-form RNA values (α=−62°, β=180°, γ=48°, δ=83°, ε=−152°, ζ=−73°)115 with allowed deviations of ± 50°. Four restraints per hydrogen bond were employed to enforce approximately linear NH-N and NH-O bond distances of 1.85 ± 0.05 Å, and two lower limit restraints per base pair were employed to weakly enforce base pair planarity (20% weighting coefficient) (G-C base pairs: G-C4 to C-C6 ≥ 8.3 Å and G-N9 to C-H6 ≥ 10.75 Å. A-U base pairs: A-C4 to U-C6 ≥ 8.3 Å and A-N9 to U-H6 ≥ 10.75 Å).
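For reference, the restraint values listed above can be collected into a small lookup structure. The sketch below is only an organizational aid with illustrative names; it is not CYANA input syntax and does not reproduce how the restraints were actually encoded for the calculations. No upper limit is given in the text for weak intraresidue H8/6-to-H2′/H3′ NOEs, so that case is deliberately left undefined.

```python
# Upper distance limits (Angstrom) by NOE intensity class, from the text.
NOE_UPPER = {"strong": 2.7, "medium": 3.3, "weak": 5.0}
# Looser limits for intraresidue H8/6-to-H2'/H3' pairs (no "weak" value is given).
INTRARESIDUE_UPPER = {"strong": 3.2, "medium": 4.2}

# Cross-helix P(i)-P(i+n) distance windows (Angstrom) for A-form segments.
PP_WINDOWS = {2: (16.1, 17.1), 3: (14.2, 15.2), 4: (11.7, 12.7),
              5: (9.4, 10.4), 6: (9.0, 10.0)}

# A-form backbone torsion centers (degrees) and allowed deviation.
A_FORM_TORSIONS = {"alpha": -62.0, "beta": 180.0, "gamma": 48.0,
                   "delta": 83.0, "epsilon": -152.0, "zeta": -73.0}
TORSION_TOLERANCE = 50.0


def noe_upper_limit(intensity: str, intraresidue_base_to_sugar: bool = False) -> float:
    """Return the upper distance limit for an NOE of the given intensity class."""
    table = INTRARESIDUE_UPPER if intraresidue_base_to_sugar else NOE_UPPER
    if intensity not in table:
        raise ValueError(f"no limit defined for intensity {intensity!r} in this class")
    return table[intensity]


def torsion_window(name: str):
    """Return (lower, upper) bounds in degrees for an A-form torsion restraint."""
    center = A_FORM_TORSIONS[name]
    return (center - TORSION_TOLERANCE, center + TORSION_TOLERANCE)


def within_window(value: float, window) -> bool:
    """True if a measured distance or angle falls inside a (lower, upper) window."""
    lower, upper = window
    return lower <= value <= upper
```

For example, `noe_upper_limit("medium", intraresidue_base_to_sugar=True)` returns the 4.2 Å limit, and `within_window(16.5, PP_WINDOWS[2])` checks a modeled P(i)-P(i+2) distance against its allowed range.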
The RNA sample (24 μM [ΨCD]2 in TRIS buffer, pH 7.0, containing 140 mM KCl and 2 mM MgCl2) was applied onto glow-discharged 200-mesh Quantifoil R 1.2/1.3 copper holey grids (Quantifoil Micro Tools GmbH, Germany) pre-treated with 15 nm gold particles. The gold particles were used as fiducial markers during the tomographic reconstruction. The grid with the sample was blotted and vitrified using a Mark III Vitrobot (FEI, USA) and stored in liquid nitrogen. Data collection was performed on a JEM2200FS electron microscope with a field emission gun and an in-column energy filter (energy slit of 10 eV) at 200 kV and ~23,123x detector magnification from −60 to +60° in 2° increments. The intended defocus was 8 μm. The cumulative specimen dose was ~70-90 electrons/Å2. Tilt series were collected on a Gatan 4k × 4k CCD camera using SerialEM 116. Final sampling was 6.49 Å/pixel. Tomograms were reconstructed using IMOD and visualized by 3dmod.117
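The acquisition parameters above imply a few derived quantities that are useful when judging the data: the number of images per tilt series, the approximate dose per image, and how many pixels the particle spans. The sketch below computes these, assuming (this is not stated in the text) that the cumulative dose was divided equally among tilts; the 95 Å, 45 Å and 25 Å particle dimensions are those reported for the averaged density.

```python
PIXEL_SIZE_A = 6.49               # final sampling, Angstrom per pixel
CUMULATIVE_DOSE = (70.0, 90.0)    # electrons per square Angstrom, approximate range
PARTICLE_DIMS_A = (95.0, 45.0, 25.0)


def tilt_angles(start: int = -60, stop: int = 60, step: int = 2):
    """Tilt angles in degrees, inclusive of both end points."""
    return list(range(start, stop + 1, step))


def dose_per_tilt(cumulative: float, n_tilts: int) -> float:
    """Dose per image if the cumulative dose is split evenly (an assumption)."""
    return cumulative / n_tilts


def pixels_spanned(length_a: float, pixel_size_a: float = PIXEL_SIZE_A) -> float:
    """Number of pixels covered by a feature of the given length."""
    return length_a / pixel_size_a


if __name__ == "__main__":
    angles = tilt_angles()
    print(f"{len(angles)} images per tilt series")
    low, high = (dose_per_tilt(d, len(angles)) for d in CUMULATIVE_DOSE)
    print(f"about {low:.2f}-{high:.2f} e/A^2 per image (even split assumed)")
    for dim in PARTICLE_DIMS_A:
        print(f"{dim:.0f} A spans about {pixels_spanned(dim):.1f} pixels")
```

At this sampling the particle covers only about 15 pixels along its longest axis, which illustrates why subvolume averaging was needed to recover a reliable overall shape.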
Densities visually examined with Chimera118 could be grouped qualitatively by volume. Two general groups were observed, one in which the densities had volumes approximately consistent with a 132 nucleotide RNA, and another in which densities had considerably larger volumes. In some cases the density was found to be extended in the third dimension, forming elongated, non-symmetric assemblies (i.e., linear chains rather than the expected dimers), which was not obvious from the two dimensional projection. A total of 47 subvolumes from the group with smaller densities were subjected to an all-vs-all orientation search, which was performed independent of the NMR-derived [ΨCD]2 structure.119 In order to reduce the noise, the raw subvolumes were low-pass filtered to 40 Å and spherically masked to a size slightly larger than a single particle. To speed the computational analysis, the data were shrunk in volume by a factor of two. Resulting rotations and translations were applied back to the original “raw” (i.e., not low-pass filtered, masked or shrunk) subvolumes before averaging. It should be noted that because of the limited tilt range (−60 to +60°), it was easier to identify [ΨCD]2 structures that were oriented with their long axes more or less along the z axis, simply because this orientation produces the highest density in projection.
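A short calculation shows why shrinking the data does not discard information that survives the 40 Å low-pass filter. The sketch below assumes that "shrunk by a factor of two" means two-fold binning along each axis, which is one plausible reading and is not confirmed by the text; the pixel size is the 6.49 Å/pixel sampling reported for the tomograms.

```python
PIXEL_SIZE_A = 6.49      # Angstrom per pixel (from the text)
LOWPASS_A = 40.0         # low-pass filter resolution, Angstrom
BIN_FACTOR = 2           # assumed linear binning factor


def nyquist_resolution(pixel_size_a: float) -> float:
    """Finest resolvable period (Angstrom) at a given sampling: two pixels."""
    return 2.0 * pixel_size_a


def cutoff_cycles_per_pixel(resolution_a: float, pixel_size_a: float) -> float:
    """Express a resolution cutoff as a spatial frequency in cycles per pixel."""
    return pixel_size_a / resolution_a


def binning_preserves_cutoff(pixel_size_a: float, bin_factor: int,
                             resolution_a: float) -> bool:
    """True if the filter cutoff is still below Nyquist after binning."""
    return nyquist_resolution(pixel_size_a * bin_factor) <= resolution_a


if __name__ == "__main__":
    binned_pixel = PIXEL_SIZE_A * BIN_FACTOR
    print(f"Nyquist before binning: {nyquist_resolution(PIXEL_SIZE_A):.2f} A")
    print(f"Nyquist after binning:  {nyquist_resolution(binned_pixel):.2f} A")
    print(f"cutoff after binning:   "
          f"{cutoff_cycles_per_pixel(LOWPASS_A, binned_pixel):.3f} cycles/pixel")
    print(binning_preserves_cutoff(PIXEL_SIZE_A, BIN_FACTOR, LOWPASS_A))
```

Under this assumption the binned Nyquist limit (about 26 Å) is still finer than the 40 Å filter, so the orientation search sees the same signal at roughly one eighth of the voxel count.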
The averaging method used is an iterative procedure to align subvolumes to each other. The initial averages were obtained from the pairs of particles having the highest cross-correlation (excluding pairs where either member appears higher in the list).119 This procedure eventually generated two similar 3D class averages, with 33 and 14 subvolumes, respectively. In the next round of refining the classification, the 33-subvolume cryo-ET average was used as a template to which all initial subvolumes were compared and translation and rotation operations were applied. The number of subvolumes that were included in the final average was determined by the quality of their cross-correlation to the initial average and included a final total of 38 (Supplementary Figure S4). Additional control specimens (buffer and [ΨCES]2; see Results section) were prepared and analyzed as described above.
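The pair-selection rule in the first averaging step can be read as a greedy matching: rank all particle pairs by cross-correlation and accept a pair only if neither member already belongs to a higher-ranked pair. The sketch below implements that reading on a precomputed score matrix. It is an interpretation for illustration, not the published software (ref. 119), and the scores in the usage example are invented.

```python
def greedy_pairs(scores):
    """Pick non-overlapping pairs in order of decreasing cross-correlation.

    `scores` is a symmetric square matrix (list of lists) in which
    scores[i][j] is the best cross-correlation found between subvolumes i and j.
    """
    n = len(scores)
    ranked = sorted(
        ((scores[i][j], i, j) for i in range(n) for j in range(i + 1, n)),
        reverse=True,
    )
    used = set()
    pairs = []
    for _score, i, j in ranked:
        # Skip the pair if either member already appears higher in the list.
        if i in used or j in used:
            continue
        pairs.append((i, j))
        used.update((i, j))
    return pairs


if __name__ == "__main__":
    # Invented scores for four subvolumes.
    example = [
        [1.0, 0.9, 0.8, 0.1],
        [0.9, 1.0, 0.3, 0.2],
        [0.8, 0.3, 1.0, 0.7],
        [0.1, 0.2, 0.7, 1.0],
    ]
    print(greedy_pairs(example))
```

Each accepted pair would then be aligned and averaged, and the resulting averages would enter the next iteration in place of the individual subvolumes.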
Support from the NIH (GM42561 to M.F.S., CA069300 to A.T., P41RR02250 to W.C.) is gratefully acknowledged. A.S.-M. was supported by NIH MARC U*Star (GM08663) and HHMI education grants, and R.I. was supported by NIH Training Grant No 5T90DK070121-05 through the Gulf Coast Consortia. We are grateful to Alexander Grishaev (NIAID, NIH) for collecting SAXS data and assisting in its interpretation.
Accession Numbers. Coordinates have been deposited in the Protein Data Bank with accession number 2l1f. 1H NMR chemical shift assignments have been deposited in the BMRB with accession number 17083.