Genomes are extensively transcribed to generate diverse coding and regulatory RNAs whose structure and function remain to be characterized.1
Despite being composed of only four chemically similar nucleotides, RNA can base pair with itself and interact with other molecules to form complex secondary and tertiary structures. Predicting how RNAs fold, and which functions those folds support, remains a formidable challenge. Recently, in vitro RNA structure-probing experiments have improved the accuracy of secondary structure models, adding to our knowledge of RNA structural motifs;2,3 however, RNA structure in vivo is likely to be more complex, and in some cases fundamentally different from what is observed in vitro.
RNA structure in cells is influenced by the rate of transcription, local solution conditions, the binding of small molecules, and interactions with numerous RNA-binding proteins.5
These observations hint that the physical state of RNA within the cell is crucial for its function, but current knowledge of RNA structures in cells is limited.
Several reagents have been developed to obtain structural maps of RNAs. Ribonucleases that are specific for single- or double-stranded RNA can provide low-resolution measurements of RNA secondary structure.6,7 Alternatively, chemical mapping, either by direct base methylation with dimethylsulfate (DMS) or by cleavage of solvent-exposed residues through hydroxyl radical or metal catalysis, is currently the most rigorous approach for determining RNA secondary and tertiary structure.2,3 However, reagents such as DMS and kethoxal do not react with all four bases in RNA, and lead(II) cleavage has strong biases near sites of metal coordination that may confound the interpretation of secondary structure measurements. Such biases have limited the ability to obtain structural information for every nucleotide. While DMS has been used for RNA structure probing in vivo, other reagents cannot be used in intact mammalian cells. These observations point toward the need for a generalizable reagent that can be used in a wide range of organisms to accurately measure RNA structure inside living cells.
The 2’-hydroxyl group is a universal chemical feature of every RNA. Pioneering work by Weeks and colleagues has demonstrated the feasibility of a general method to measure the dynamics and solvent exposure of local RNA structure by probing the 2’-hydroxyl groups. Single-stranded or flexible RNA regions exhibit high 2’-hydroxyl reactivity, whereas RNA nucleotides engaged in base pairing or other interactions show lower reactivity. Selective 2’-hydroxyl acylation followed by primer extension, or SHAPE, has become the gold standard for monitoring the secondary structures of complex RNAs.9–11 An additional strength of SHAPE chemistry is that the reactivities of the 2’-hydroxyl groups can be integrated to obtain pseudo-free energy terms;3,10 these terms can then be applied to correct RNA secondary structure models, increasing their accuracy to ~95%.11 Thus, SHAPE reagents are powerful probes for measuring and predicting RNA secondary structure at single-nucleotide resolution.
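The conversion of SHAPE reactivities into pseudo-free energy terms can be illustrated with a short sketch. It assumes the commonly used logarithmic form, ΔG_SHAPE(i) = m·ln(reactivity_i + 1) + b, with slope m = 2.6 kcal/mol and intercept b = −0.8 kcal/mol. These parameter values, the function name, and the handling of missing data are assumptions introduced here for illustration and are not specified in this text.

```python
import math

# Assumed default parameters (kcal/mol) for the logarithmic pseudo-free
# energy form; they are illustrative and not taken from this article.
SLOPE_M = 2.6
INTERCEPT_B = -0.8


def pseudo_free_energy(reactivity, m=SLOPE_M, b=INTERCEPT_B):
    """Return a pseudo-free energy term (kcal/mol) for one nucleotide.

    `reactivity` is a normalized SHAPE reactivity. Missing data (None)
    contributes no term; negative values are clamped to zero (both are
    simplifying assumptions of this sketch).
    """
    if reactivity is None:
        return 0.0
    clamped = max(reactivity, 0.0)
    return m * math.log(clamped + 1.0) + b


# Hypothetical normalized reactivities for five nucleotides.
profile = [0.0, 0.1, 0.5, 1.0, None]
terms = [pseudo_free_energy(r) for r in profile]
for r, dg in zip(profile, terms):
    print(f"reactivity={r!s:>4}  pseudo-free energy={dg:+.2f} kcal/mol")
```

Under these assumed parameters, unreactive nucleotides receive a small stabilizing term for pairing and highly reactive nucleotides receive a penalty, which is how reactivity data can steer a thermodynamic folding model.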
SHAPE was recently used to interrogate the structural landscape of genomic RNA extracted from viral particles and inside intact virions.12,13
However, long incubation times (~50 min) and the comparably simple viral environment suggested to us that current SHAPE electrophiles may not be amenable to in vivo experiments in more complex cells. Indeed, we attempted to modify RNA in different species with NMIA, the canonical SHAPE electrophile. Despite numerous efforts, we could not detect any modification of 5S rRNA in these cells (see below). We surmised that this lack of detectable modification might be due to the high reactivity of NMIA (leading to a very short effective half-life in water), to its high degree of cross-reactivity with other nucleophiles in the cell, and to its limited solubility in aqueous solutions.
We sought to develop novel acylation electrophiles that are selectively reactive toward hydroxyl groups, soluble at high concentrations, and amenable to RNA modification inside living cells within a reasonable time frame. We screened several aromatic electrophiles (Supplementary Fig. 1), but none met all of the desired criteria. We therefore designed and tested electrophiles with different predicted solubilities and leaving groups, and devised simple synthetic strategies that proceed in high yield (Supplementary Fig. 2), ultimately choosing 2-methylnicotinic acid imidazolide (NAI) and 2-methyl-3-furoic acid imidazolide (FAI). These compounds are conveniently produced as 1:1 mixtures with imidazole in a DMSO stock solution by reaction of the carboxylic acids with carbonyldiimidazole. Both reagents display hydroxyl acylation specificities similar to those of NMIA (Supplementary Fig. 3). The heteroatoms were included in the aromatic rings to increase solubility, and the adjacent methyl groups tune reactivity by causing a twist in the carbonyl groups. The low-toxicity imidazole leaving groups were designed to modulate reactivity while retaining solubility. The hydrolysis rates of NAI (t1/2 = 33 min) and FAI (t1/2 = 73 min) are considerably lower than that of NMIA (t1/2
. RNA extraction or addition of β-mercaptoethanol successfully halts the acylation (Supplementary Fig. 4) in vitro and in vivo; this step allows the experimenter to terminate the reaction at will.
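To put the reported hydrolysis half-lives in perspective, the following sketch estimates the fraction of reagent that would remain unhydrolyzed after a given time. It assumes simple first-order decay in water and ignores consumption by cellular nucleophiles. The function name and the chosen time points are illustrative; only the half-lives of NAI (33 min) and FAI (73 min) come from the text.

```python
def fraction_remaining(t_min, half_life_min):
    """Fraction of reagent left after t_min minutes of first-order decay."""
    if half_life_min <= 0:
        raise ValueError("half-life must be positive")
    return 0.5 ** (t_min / half_life_min)


# Hydrolysis half-lives (minutes) reported in the text.
HALF_LIFE_MIN = {"NAI": 33.0, "FAI": 73.0}

# Illustrative incubation times (minutes), chosen for this sketch.
for reagent, t_half in HALF_LIFE_MIN.items():
    for t in (1, 15, 30):
        left = fraction_remaining(t, t_half)
        print(f"{reagent}: {left:.0%} remaining after {t} min")
```

Under this simplified model, most of either reagent would still be intact over incubations of a few minutes, which is consistent with the need for an explicit quenching step.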
[Figure: Design and experimental evaluation of NAI and FAI]
First, we used NAI and FAI to probe the secondary structure of mouse embryonic stem cell 5S rRNA in vitro. We chose 5S rRNA because of its abundance, its highly characterized structure, and its ability to fold into a stable structure without the need for protein cofactors.14 We observed very similar quantitative patterns of 2’-hydroxyl acylation with NAI or FAI versus NMIA (R² = 0.93), all of which map to residues that are predicted to be flexible (Supplementary Figs. 5, 10, 11). These data suggest that both NAI and FAI are suitable electrophiles for 2’-hydroxyl acylation on structured RNA molecules, yielding accurate structural information comparable to that obtained with existing probes.
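A comparison of this kind can be sketched as a squared Pearson correlation between two per-nucleotide reactivity profiles. The profile values below are invented for illustration and are not data from this study; the function name is likewise hypothetical, and the exact statistic used by the authors is not specified here.

```python
def pearson_r_squared(x, y):
    """Squared Pearson correlation between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have equal length of at least 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant profile has no defined correlation")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical normalized reactivities for the same eight nucleotides
# probed with two different reagents (illustrative values only).
reagent_a = [0.05, 0.10, 0.90, 1.20, 0.30, 0.02, 0.75, 0.15]
reagent_b = [0.08, 0.12, 0.80, 1.10, 0.40, 0.05, 0.70, 0.10]

print(f"R^2 = {pearson_r_squared(reagent_a, reagent_b):.2f}")
```

A value near 1 indicates that the two reagents report the same relative pattern of flexible and constrained nucleotides, even if their absolute reactivities differ.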
Next, we evaluated the ability of NAI and FAI to monitor RNA structure in live cells. Cultured mouse embryonic stem cells (ESCs) were treated with 13 mM NAI or FAI (the maximum solubility of NMIA) in aqueous buffer, but no 5S rRNA modification could be detected after one hour at this concentration. However, when the probe concentration was increased above 20 mM, we observed positive signals for modification. Both NAI and FAI caused blocks in subsequent reverse transcription in the cellular experiments, which is indicative of modification, whereas NMIA did not (Supplementary Fig. 6). In addition, NAI showed a greater extent of modification than FAI, consistent with its higher reactivity toward ATP in vitro.
We focused on NAI for further structural probing because of its higher reactivity. Application of 100 mM NAI to murine ESCs resulted in 5S rRNA modification in as little as one minute with a suitable signal-to-noise ratio. This signal begins to plateau by 15 minutes, indicating a reasonable timescale for studying biological phenomena. Even after 30 minutes of NAI treatment, murine ESCs remained attached to the tissue culture vessel, appeared morphologically normal, and were not stained by trypan blue (Supplementary Fig. 7). NAI is expected to acylate many cellular RNAs and other molecules; thus, cytotoxicity after longer-term treatment needs to be evaluated with care. To test the ability of our reagent to modify lower-abundance RNAs, we also determined the SHAPE pattern for two nuclear-localized RNAs. We were able to detect significant RNA modification, which suggests that our reagent is able to enter the nucleus and react with lower-abundance RNAs to give structural information (Supplementary Fig. 8). NAI also modifies 5S rRNA in cultured human cancer cells, Drosophila S2 cells, yeast, and E. coli (Supplementary Fig. 9), suggesting that it is a general cell-permeable probe of RNA structure.
To understand the pattern of 5S rRNA SHAPE in ESCs, we compared our data to the crystal structure of the 80S ribosome from yeast, which includes the 5S rRNA.15 Yeast and mammalian 5S rRNA exhibit very high similarity in sequence and functional domain architecture,16,17 as indicated by a CLUSTALW alignment score of 60.17 The crystal structure of the ribosome is validated by decades of molecular genetics and biochemical studies and likely represents a conformation that occurs in vivo. Overlaying our SHAPE data onto the 5S crystal structure showed that practically all residues in flexible regions or not in canonical Watson-Crick base pairs are modified, including single-stranded loops, unstable non-canonical base pairs, and a single base flipped out of the helical duplex. These results indicate that NAI can probe RNA structure in vivo with high accuracy and single-nucleotide resolution.
Comparison of SHAPE profiles of 5S rRNA in vivo versus in vitro revealed key RNA-RNA and RNA-protein interactions that dock the 5S rRNA into the ribosome. Overall, the profiles looked similar, but a few key differences suggest differential interactions in the living system. Hereafter, residues in 5S are numbered per the mouse gene (M. musculus, prefix M.M.); residues in other ribosomal subunits are numbered as in the yeast crystal structure (S. cerevisiae, prefix S.C.). First, major differences between the in vitro and in vivo modification profiles were observed at residue M.M.A50. In the crystal structure, the analogous residue S.C.U50, which sits near the nexus of Loop B and Helix III, is kinked to allow the docking of Loop C into the 28S rRNA (residues S.C.C2684 and S.C.U2683). This conformation permits residues S.C.GLU221 and S.C.LYS224 of ribosomal protein L5 to stack against residues S.C.A51 and S.C.U50. As a result, S.C.U50 seems to be pushed out of the helix, which increases its dynamics and exposes the 2’-OH for reaction. Prior saturation mutagenesis showed that S.C.A51 and S.C.U50 contact ribosomal proteins S.C.L11 and S.C.L5 to form a critical structural link between the large and small ribosomal subunits and are essential for proper ribosome function and viability.16,18 Thus, NAI can read out alterations in RNA tertiary structure that result from sampling critical conformations of the mature ribosome.
[Figure: 5S rRNA has different modification patterns in cells]
Second, a three-nucleotide bridge that connects Helix II with Loop A showed slight differences (Fig. 2e, Supplementary Fig. 11). Within the crystal structure, S.C.U12 and S.C.G67 are engaged in a multi-nucleotide bridge.15 G67 and ribosomal protein L5 interact with these residues and may stabilize them in less reactive conformations. S.C.C10 also moves out of the helix to stack on S.C.PHE20 of ribosomal protein L5. This conformation may stabilize residues S.C.C10 and S.C.U12 and shield them from sampling the reactive conformations seen in vitro.
Third, the residues of M.M.A83, which are in Loop D, were more reactive to NAI in vitro. Within the context of the 80S ribosome, these residues are engaged in extensive hydrogen-bond contacts with residues S.C.G1148 and S.C.G1171 of 28S rRNA. S.C.U86 is stacked upon S.C.A1197 and is in an H-bonding contact with cobalt hexamine. Notably, mutations of these 5S residues in yeast result in gross defects in translational accuracy.16
These differences suggest that NAI can distinguish subtle dynamic differences that may result from protein interactions, yet can still identify residues that are unpaired and therefore more flexible in the context of the cell. These findings point to in vivo versus in vitro SHAPE comparison as a powerful, unbiased strategy for pinpointing key residues in ncRNA interaction and function.
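The in vivo versus in vitro comparison described above can be sketched as a per-residue difference between two normalized reactivity profiles, flagging residues whose change exceeds a cutoff. The reactivity values, the cutoff of 0.3, and the function name are hypothetical and chosen only for illustration; the residue labels follow the mouse numbering used in the text, and the signs of the example changes mirror the qualitative trends described there.

```python
def differential_residues(in_vitro, in_vivo, threshold=0.3):
    """Return residues whose reactivity changes by at least `threshold`.

    Both arguments map residue labels to normalized reactivities. The
    result maps each flagged residue to (in vivo - in vitro), ordered by
    decreasing absolute change. Residues absent from either profile are
    skipped.
    """
    changes = {}
    for residue in in_vitro.keys() & in_vivo.keys():
        delta = in_vivo[residue] - in_vitro[residue]
        if abs(delta) >= threshold:
            changes[residue] = delta
    ordered = sorted(changes.items(), key=lambda item: (-abs(item[1]), item[0]))
    return dict(ordered)


# Hypothetical normalized reactivities (illustrative values only).
in_vitro = {"C10": 0.80, "G30": 0.10, "A50": 0.20, "A83": 0.90}
in_vivo = {"C10": 0.30, "G30": 0.15, "A50": 0.90, "A83": 0.20}

for residue, delta in differential_residues(in_vitro, in_vivo).items():
    direction = "more" if delta > 0 else "less"
    print(f"{residue}: {direction} reactive in vivo (change {delta:+.2f})")
```

Residues flagged in this way are candidates for contacts that exist only in the cellular context, such as protein binding or docking into a larger assembly, and would still need to be interpreted against a structural model.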
Loop E of 5S RNA provided a prime example of the power of in vivo SHAPE analysis. In the context of the fully assembled ribosome, loop E adopts a unique bulge structure in yeast 5S RNA, but not in other species studied.19,20 The S.C. crystal structure shows that C72 and C73 are pushed out of Loop E. These residues also have the highest B-factors, suggesting that they are highly dynamic. We confirmed that loop E is accessible in vivo only in yeast and not in other species (Supplementary Fig. 9). Comparison of in vitro vs. in vivo
modified yeast 5S rRNA revealed conserved residues with similar modification patterns as in the mouse 5S rRNA, including the key residue S.C.U
U49). Further, residues with lower B-factors in the crystal structure were less reactive in vivo. Importantly, residues C72 and C73 displayed the largest differences in the cell, with a marked increase in reactivity. Moreover, residues whose modification pattern is altered in vivo are nearly always those whose mutation impairs 5S function in vivo. Our analysis is the first comparison of 5S rRNA structure in vitro versus in vivo. Overall, our experiments establish that our acylation reagents are capable of modifying RNA in vivo and can sensitively read out features of RNA structure that arise from the unique conformations RNA adopts in the cell, whether through changes in base pairing or through protein-RNA interactions.
Our results suggest an approach to directly probe RNA structure in living cells and to assess dynamic changes in RNA structure across cell states or in cells in which a given gene has been knocked out. Importantly, we demonstrate that our chemical probe can read out specific, sensitive structural differences in RNA that depend on its tertiary structure and RNA-protein interactions. Currently, obtaining physiologically relevant RNA structural information requires considerable effort to reconstitute ribonucleoprotein complexes in vitro. As the catalog of non-coding RNA molecules21–23 and functional motifs in coding transcripts24,25 continues to expand, the need to probe their structure in the cell will become increasingly important.