i-Motifs are four-stranded DNA structures consisting of two parallel DNA duplexes held together by hemi-protonated and intercalated cytosine base pairs (C:CH+). They have attracted considerable research interest for their potential role in gene regulation and their use as pH responsive switches and building blocks in macromolecular assemblies. At neutral and basic pH values, the cytosine bases deprotonate and the structure unfolds into single strands. To avoid this limitation and expand the range of environmental conditions supporting i-motif folding, we replaced the sugar in DNA by 2-deoxy-2-fluoroarabinose. We demonstrate that such a modification significantly stabilizes i-motif formation over a wide pH range, including pH 7. Nuclear magnetic resonance experiments reveal that 2-deoxy-2-fluoroarabinose adopts a C2′-endo conformation, instead of the C3′-endo conformation usually found in unmodified i-motifs. Nevertheless, this substitution does not alter the overall i-motif structure. This conformational change, together with the changes in charge distribution in the sugar caused by the electronegative fluorine atoms, leads to a number of favorable sequential and inter-strand electrostatic interactions. The availability of folded i-motifs at neutral pH will aid investigations into the biological function of i-motifs in vitro, and will expand i-motif applications in nanotechnology.
Since the discovery of the DNA double helix, the understanding of DNA structure has expanded considerably, with a plethora of polymorphs now known (1). Two structures have attracted considerable interest due to their possible involvement in telomere maintenance and gene regulation: (i) the G-quadruplex (G4) (2,3), which consists of stacked guanine tetrads stabilized by monovalent cations (K+/Na+); and (ii) the i-motif (4), a cytosine-rich structure consisting of two hemi-protonated (C:CH+) parallel duplexes intercalated in an antiparallel orientation (5,6).
There is increasing evidence for the occurrence of the i-motif in regulatory regions of the human genome such as centromeres (7,8), telomeres (9,10) and oncogene promoter regions (11,12). These structures are also attracting much interest in nanotechnology, for example in the control of macromolecular structure assembly (13) and as pH responsive switches (14,15).
The i-motif is a compact DNA structure characterized by short (3.1 Å) base-pairing distances, a 12–16° helical twist between adjacent C:CH+ base pairs (Figure 1A), and close sugar–sugar contacts stabilized by CH…O interactions (16). i-Motif structures can have different intercalation topologies known as 3′E and 5′E. In the 3′E topology, the terminal C:CH+ base pair is at the 3′-end, while the 5′E topology has the terminal C:CH+ base pair at the 5′-end (9). Repulsive interactions exist between the charged C-imino protons and between the phosphate groups across the crowded narrow groove. In vitro, unmodified i-motif structures are stable across a narrow pH range (~3.5–5.5), with maximum stability occurring at a pH equal to the pKa of the cytosine N3. At pH values above the cytosine N3 pKa, the stabilizing C:CH+ base pairing is lost and the i-motif structure is denatured. However, it is likely that in vivo factors such as negative superhelicity (17), cellular proteins (18) and molecular crowding conditions (19) stabilize i-motif DNA structures under neutral pH. To help compensate for the absence of these factors in vitro, small molecule ligands and certain metal ions (Cu+2; Ag+) have been introduced to promote i-motif folding at neutral pH (20). Of the sugar and phosphate modifications so far reported, none has provided sufficient stability at neutral pH to allow for biochemical studies in vitro (21–25).
The present study was prompted by observations that arabinose (2′OH) (26), but not ribose (2′OH (27) or 2′F (28)), is well tolerated within the i-motif structure. We hypothesized that 2′F-arabinose (2′F-araC) would enhance interstrand CH…O interactions within the narrow grooves (Figure 1A). Sequential FC-H2′…O4′ interactions have been previously observed in duplexes (29) and G-quadruplex structures (30). Furthermore, such a modification could potentially stabilize the i-motif by augmenting ion-dipole and stacking interactions of C:CH+ base pairs (16). To explore the effect of 2′-β-fluorination on the structure and stability of different families of i-motif structures, 2′F-araC modifications were introduced in various i-motif forming sequences. We show that replacing the 2-deoxyribose sugar with 2-deoxy-2-fluoroarabinose enables formation of stable i-motif structures over a wide pH range, with thermal stabilities of ~30°C at neutral pH.
Oligonucleotide synthesis was performed on an ABI 3400 DNA synthesizer from Applied Biosystems at 1 μmol scale on Unylinker (Chemgenes) CPG solid support. Thymidine (dT), deoxycytidine (N-acetyl) (dC) and deoxyadenosine (N-Bz) (dA) phosphoramidites were used at 0.1 M concentration in acetonitrile, and coupled for 110 s. 2′F-araC was used at 0.13 M concentration and coupled for 600 s. After completion of the synthesis, CPG was transferred to a 1.5 ml screw-cap eppendorf. A total of 1000 μl aqueous ammonium hydroxide was added and the eppendorf was placed on a shaker at room temperature for 48 h. The deprotection solution was centrifuged and decanted from the CPG. Samples were vented for 30 min, chilled in dry ice and evaporated to dryness. Hexamer sequences were purified by anion exchange HPLC on a Waters 1525 instrument using a Protein-Pak DEAE 5PW column (21.5 mm × 15 cm). The buffer system consisted of water (solution A) and 1 M aqueous lithium perchlorate (solution B), at a flow rate of 4 ml/min. The gradient was 0−40% solution B over 50 min at 60°C. Under these conditions the desired peaks eluted at roughly 28 min. The centromeric (HC) and telomeric (HT) sequences were purified by anion exchange HPLC on an Agilent 1200 Series Instrument using a Protein-Pak DEAE 5PW column (7.5 × 75 mm) at a flow rate of 1 ml/min. The gradient was 0–24% solution B over 30 min at 60°C. Under these conditions, the desired HC and HT peaks eluted at roughly 21 and 24 min, respectively. Samples were desalted on NAP-25 desalting columns according to manufacturer protocol. The extinction coefficient of the 2′F-araC was approximated to that of the unmodified sequence. Masses were verified by ESI-MS.
UV thermal denaturation data were obtained on a Varian CARY 100 UV-visible spectrophotometer equipped with a Peltier temperature controller. The concentration of oligonucleotides used was 4.6 μM for the hexamer sequences and 4 μM for the centromeric and telomeric sequences. Samples were dissolved in appropriate buffer as indicated in the text. Concentrations were determined after quantitating the samples by UV absorbance at λ = 260 nm. Samples were heated to 90°C for 15 min, then cooled slowly to room temperature and stored in the fridge (5°C) for at least 18 h before the measurements were performed. Denaturation curves were acquired at 265 nm at a rate of 0.5°C/min. Experiments at pH 5.0 and pH 7.0 at 0.5°C/min were performed at least in triplicate, and experiments at 0.2°C/min were run once for comparison. Samples were kept under a nitrogen sheath at temperatures below 12°C. The dissociation temperatures were calculated as the midpoint of the transition (T1/2) using the first derivative of the experimental data.
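The first-derivative estimate of T1/2 described above can be illustrated with a minimal sketch. It assumes an evenly sampled melting curve and takes T1/2 as the temperature at which |dA/dT| is largest; the function name `t_half_from_first_derivative` and the sigmoidal curve are hypothetical and synthetic, not the data or software used in this study. Real curves are noisy and would normally be smoothed first, which this sketch omits.

```python
import numpy as np

def t_half_from_first_derivative(temps, absorbance):
    """Return the temperature where the melting curve changes fastest.

    Uses the magnitude of dA/dT so the estimate does not depend on
    whether absorbance rises or falls on melting.
    """
    temps = np.asarray(temps, dtype=float)
    absorbance = np.asarray(absorbance, dtype=float)
    dA_dT = np.gradient(absorbance, temps)  # numerical first derivative
    return float(temps[np.argmax(np.abs(dA_dT))])

# Synthetic, noise-free sigmoidal curve (illustrative values only):
# midpoint placed at 45.0 degrees C, sampled every 0.5 degrees C.
temps = np.arange(5.0, 90.5, 0.5)
true_mid = 45.0
absorbance = 0.30 + 0.10 / (1.0 + np.exp(-(temps - true_mid) / 3.0))

t_half = t_half_from_first_derivative(temps, absorbance)
print(f"Estimated T1/2: {t_half:.1f} C")
```

Because the estimate is read off the sampling grid, its resolution is limited to the temperature step of the acquisition.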
Circular dichroism (CD) studies were performed at 5°C on a JASCO J-810 spectropolarimeter using a 1 mm path length cuvette. Temperature was maintained using the Peltier unit within the instrument. Spectra were recorded from 350–230 nm at a scan rate of 100 nm min−1 and a response time of 2.0 s with three acquisitions recorded for each spectrum. The spectra were corrected by subtraction of the buffer scan. Data were smoothed using the means-movement function within the JASCO graphing software. Oligonucleotide solutions for CD measurements were prepared with 10 mM sodium phosphate buffer (pH 5.0 and pH 7.0) in a similar manner to that used for UV melting. The concentration of oligonucleotides used was 30 μM for the hexamer sequences, 100 μM for the centromeric sequences and 50 μM for the telomeric sequences.
Native gel electrophoresis was performed using 24% polyacrylamide at pH 5.0 in TAE (Tris:Acetate:EDTA) buffer. Running buffer was 1 × TAE. Samples were equilibrated in 10 mM sodium phosphate (pH 5.0) as described in the UV melting section. The gels were run at 200 V inside a refrigerator (5°C) for 12 h or at 280 V for 6 h. Gels were visualized by treatment with Stains-All dye (Sigma E-9379) or by UV shadowing. The oligonucleotide controls were dT12 and dT24 strands.
Samples for nuclear magnetic resonance (NMR) experiments were suspended in 300 μl of either D2O or H2O/D2O 9:1 in 10 mM sodium phosphate buffer. NMR spectra were acquired on Bruker Avance spectrometers operating at 600, 700 or 800 MHz, and processed with Topspin software. TOCSY spectra were recorded with standard MLEV17 spinlock sequence and with 80 ms mixing time. NOESY spectra in H2O were acquired with 50 and 150 ms mixing times. For 2D experiments in H2O, water suppression was achieved by including a WATERGATE (31) module in the pulse sequence prior to acquisition. Two-dimensional experiments were carried out at temperatures ranging from 5 to 25°C. 19F resonances were assigned from 1H-19F HETCOR and 19F detected HOESY spectra (32). The spectral analysis program Sparky (33) was used for semiautomatic assignment of the NOESY cross-peaks.
Distance constraints were obtained from a qualitative estimation of Nuclear Overhauser Effect (NOE) intensities. In addition to these experimentally derived constraints, hydrogen bond and planarity constraints for the base pairs were used in the initial DYANA calculations. Target values for distances and angles related to hydrogen bonds were set to values obtained from crystallographic data in related structures (34). Due to the relatively broad line-widths of the sugar proton signals, J-coupling constants were not accurately measured, but only roughly estimated from DQF-COSY cross-peaks. Loose values were set for the sugar dihedral angles δ, ν1 and ν2 to constrain deoxyribose conformation to north or south domain. No backbone angle constraints were employed. Distance constraints with their corresponding error bounds were incorporated into the AMBER potential energy by defining a flat-well potential term.
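As an illustration of what a flat-well restraint term does, the sketch below uses a simplified form: zero penalty while the distance lies inside its error bounds and a harmonic penalty outside them. This is an assumption made for clarity and is not the exact functional form implemented in AMBER; the function name, bounds and force constant are hypothetical.

```python
def flat_well_penalty(r, lower, upper, k=1.0):
    """Simplified flat-well restraint energy (illustrative only).

    r      -- current interproton distance (angstroms)
    lower  -- lower error bound of the restraint
    upper  -- upper error bound of the restraint
    k      -- hypothetical force constant (energy / angstrom^2)
    """
    if r < lower:
        return k * (lower - r) ** 2   # too short: harmonic penalty
    if r > upper:
        return k * (r - upper) ** 2   # too long: harmonic penalty
    return 0.0                        # inside the well: no penalty

# A distance inside the bounds costs nothing; violations are penalized.
inside = flat_well_penalty(3.0, 2.0, 4.0)
too_long = flat_well_penalty(5.0, 2.0, 4.0, k=2.0)
too_short = flat_well_penalty(1.5, 2.0, 4.0)
print(inside, too_long, too_short)
```

The flat bottom reflects the qualitative nature of NOE-derived distances: any value within the error bounds is treated as equally consistent with the data.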
Structures were calculated with the program DYANA 1.4 (35) and further refined with the SANDER module of the molecular dynamics package AMBER 7.0 (36). The resulting DYANA structures were used as starting points for the AMBER refinement, consisting of 1 ns trajectories in which explicit solvent molecules were included and using the Particle Mesh Ewald method to evaluate long-range electrostatic interactions. Non-experimental constraints used in the initial DYANA calculations were removed in the AMBER refinement. The specific protocols for these calculations have been described elsewhere (37). The AMBER-98 force field (38) was used to describe the DNA, and the TIP3P model was used to simulate water molecules. Analysis of the representative structures as well as the MD trajectories was carried out with the programs Curves V5.1 (39) and MOLMOL (40).
Differential scanning calorimetry (DSC) was performed using a NanoDSC-III (TA Instruments, USA). All samples were 150 μM in 10 mM sodium phosphate (pH 5.0). Sample data were collected in triplicate by scanning from 5–80°C at a scan rate of 0.5°C per minute. Fitting procedures are explained in detail in the Supplementary Data. Errors were calculated according to the variance-covariance method (41).
Sequences of the oligonucleotides prepared for this study are provided in Table 1. The first series (named H) is derived from dTCCCCC, a hexanucleotide that associates into a tetrameric i-motif structure (4) (Figure 1B), whereas the HT and HC series are derived from human telomeric and centromeric DNA sequences, respectively. Their unmodified control sequences, HT-0 and HC-0, form monomeric and dimeric i-motif structures, respectively (Figure 1C and D).
Circular dichroism provides a convenient way to detect the formation of i-motifs since their spectra exhibit characteristic negative and positive bands at ~265 and ~285 nm, respectively (42). As previously reported, the model sequence dTCCCCC (H-1) exhibits a positive band that decreases in magnitude and becomes blue shifted as the pH is raised above 5.5 (Figure 2A). By contrast, some of the modified hexanucleotides (e.g., H-3, H-4, H-5 and H-6) exhibit this characteristic i-motif CD signature at pH 6.5 (Figure 2B and Supplementary Figure S1). Similarly, the modified centromeric and telomeric sequences maintained the i-motif CD signature at neutral pH (Figure 2C and D). pH titrations were performed for the modified and unmodified i-motif structures to determine the pH at which 50% of the oligonucleotide is folded into an i-motif structure (pH1/2). The titration profiles were fit to a standard titration model assuming a single protonation event in order to extract populations of folded (protonated) and unfolded (deprotonated) states (Supplementary Figure S2) as a function of pH. The fraction protonated profiles indicate that the modified i-motif structures are markedly stabilized at pH 7.0 relative to the corresponding unmodified references. For example, H-4 and H-6 are roughly 40 and 50% folded at pH 7.0, respectively, whereas H-1 is 10% folded. The pH stabilizations translate to the centromeric and telomeric sequences, where the fraction protonated profiles for the modified HC-3 and HT-4 sequences show folded populations of ~60 and 90% at pH 7.0, respectively, compared to the unmodified HC-0 and HT-0, which are roughly 10 and 30% folded (Supplementary Figure S2). This 6- and 3-fold increase in folded population translates to an increase in thermal stability for HC-3 (ΔT1/2 = +18.1°C) and HT-4 (ΔT1/2 = +17.2°C).
Here we use T1/2 to refer to the midpoint of the dissociation transition, owing to the observed hysteresis between forward and reverse scans for many of the studied sequences. The ΔpH1/2 from the modifications are listed in Table 2. An opposite trend is observed for the nucleosides: the pKas of 2′F-araC and dC are 3.9 and 4.4, respectively. This indicates that the effective pKa of 2′F-araC residues is strongly affected by the three-dimensional structure of the i-motif. Most likely, this large pKa shift is due to the hydrogen-bonding network and electrostatic interactions in the neighborhood of the protonation sites (43).
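The single-protonation titration model used to obtain pH1/2 can be sketched as follows. The sketch assumes the folded fraction follows f(pH) = 1 / (1 + 10^(pH − pH1/2)) and recovers pH1/2 by a simple least-squares grid search; the function names, the grid and the titration points are hypothetical and synthetic, and the actual fitting procedure used for Supplementary Figure S2 may differ.

```python
import numpy as np

def fraction_folded(pH, pH_half):
    """Folded (protonated) fraction for a single protonation event."""
    pH = np.asarray(pH, dtype=float)
    return 1.0 / (1.0 + 10.0 ** (pH - pH_half))

def fit_pH_half(pH_values, fractions, grid=None):
    """Least-squares grid search for pH1/2 (hypothetical helper)."""
    if grid is None:
        grid = np.arange(3.0, 9.0001, 0.01)  # candidate pH1/2 values
    fractions = np.asarray(fractions, dtype=float)
    sse = [np.sum((fraction_folded(pH_values, g) - fractions) ** 2)
           for g in grid]
    return float(grid[int(np.argmin(sse))])

# Synthetic titration generated with pH1/2 = 6.8 (illustrative only).
pH_values = np.array([4.0, 5.0, 6.0, 6.5, 7.0, 7.5, 8.0])
synthetic = fraction_folded(pH_values, 6.8)

fitted = fit_pH_half(pH_values, synthetic)
print(f"Fitted pH1/2: {fitted:.2f}")
```

By construction the model gives a folded fraction of exactly 0.5 at pH = pH1/2, which is the definition of pH1/2 used in the text.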
i-Motif formation can also be detected by 1H-NMR. Signals at 15–16 ppm, characteristic of cytosine imino protons in hemi-protonated C:CH+ base pairs, were observed for most of the modified sequences over a wide temperature and pH range (Supplementary Figures S3–5). For instance, the control i-motif (H-1) exhibits no imino signals at pH 7.0 (Figure 3A), whereas i-motif structures formed by H-3, H-4, H-5 and H-6 maintain C:CH+ base pairing at neutral pH and at relatively high temperatures (35°C) (Figure 3B and Supplementary Figure S3). This effect is more pronounced in the case of the dimeric centromeric sequences (Supplementary Figure S4) and monomeric human telomeric sequences (Supplementary Figure S5), in which 1H-NMR spectra exhibit imino signals at 15–16 ppm that are clearly visible at physiological temperatures and neutral pH.
UV melting experiments showed a significant increase in T1/2 at acidic and neutral pH in all the monomeric (HT) and dimeric (HC) sequences studied herein (Table 1 and Figure 4). Although very similar spectral features (NMR and CD) were observed for the native (HT-0) and modified telomeric sequences (e.g. HT-2 and HT-4), the modified structures were significantly more stable at both pH 5.0 and 7.0 relative to the unmodified strands (ΔT1/2 ≈ 17–20°C; Table 1). HC-4 and HT-2 exhibit T1/2 values of 28.2 and 32.6°C at pH 7.0, respectively (Figure 4). It is interesting to note that ΔT1/2 values obtained between modified and control oligonucleotides were not dependent on the temperature gradient used in the melting experiments (Table 1 and Supplementary Table S2). In addition, the dissociation/association profiles obtained at 0.5 and 0.2°C/min for every sequence studied were virtually identical, indicating that the measured processes are the same, despite the presence of hysteresis. The same stabilization effect was observed in the majority of the tetrameric sequences (Table 1 and Supplementary Table S1). Melting of H-1 was detected at pH 3.5–5.5, whereas melting of the modified sequences was observed over a broader pH range (3.0–7.0). However, the denaturation curves of the tetrameric structures are more complex, showing more than one transition, which suggests the occurrence of several species. The most substituted tetrameric sequences exhibit lower T1/2 values, in contrast to the general stabilization effect observed in all the other sequences. These apparently contradictory results prompted us to study the tetrameric sequences in more detail.
Native polyacrylamide gel electrophoresis (PAGE) experiments confirmed the formation of more than one species in the tetrameric i-motif structures (Supplementary Figures S6 and 7) at low temperature (5°C). For example, the denaturation curve of the singly modified oligomer (H-2, 4.6 μM; pH 5.0) exhibits a minor transition with a T1/2 of 10.0°C and a major transition with a T1/2 of 53.6°C that corresponds closely to that of the control sequence (T1/2 of 51.5°C). This is also evident in a native gel, where H-2 (40 μM, Supplementary Figure S6) appeared as a mixture of closely moving bands that migrated with the same mobility as the tetrameric control structure H-1. Results obtained for H-2, H-3, H-4 and H-5 from concentration-dependent gel electrophoresis experiments are consistent with the formation of topologically similar i-motif structures at all concentrations studied (4.6–200 μM; Supplementary Figure S7). A major species was observed for the unmodified control sequence at 4.6–20 μM, with a minor component appearing on the gel at >40 μM strand concentration (Supplementary Figure S7) (4). Interestingly, the fully substituted sequence (H-6) exhibited a significantly lower T1/2 value (37.5°C, Table 1), fast association-dissociation kinetics (Supplementary Figure S8) and aberrant gel electrophoretic mobility (Supplementary Figure S7), all consistent with the formation of a dimeric structure in this case. For instance, at pH 5.0 and 40 μM, H-6 appears as one fast moving band and an additional band exhibiting co-migration with the H-1 control (Supplementary Figure S7; lane 6). The slow moving band corresponds to a tetrameric structure and becomes the predominant species at higher strand concentration (100–200 μM; Supplementary Figure S7).
The wide dispersion of fluorine chemical shifts in the 19F NMR spectra facilitates the study of multiple species in equilibrium under various experimental conditions (44). In most of the tetrameric structures, the number of 19F signals is not consistent with a single conformation, in agreement with PAGE data. 19F resonances from different species, including the unfolded oligonucleotide, can be observed simultaneously in the whole range of temperatures explored. This indicates that the equilibria are slow on the NMR timescale at all temperatures, as observed previously in other 2′F-araC-modified oligonucleotides (45,29). With the exception of sequences H-5 and H-6, 19F spectra recorded at high (1-2 mM) and low (0.1 mM) concentrations are very similar (Supplementary Figure S9).
As observed in UV melting and gel electrophoresis experiments, the fully substituted sequence (H-6) exhibits distinctive features. 19F NMR spectra for H-6 at high and low oligonucleotide concentration are completely different (Supplementary Figure S9e). Thus, 19F NMR spectra confirm the presence of species of different molecularity, as suggested by UV and gel electrophoretic mobility experiments. 19F signals of the low concentration species are observed in the high concentration sample prepared under a fast annealing procedure. On the other hand, the high concentration spectrum is partially recovered after storing the low concentration sample for several months at T = 5°C (Supplementary Figure S10). This indicates that the dimeric species is kinetically favored. Such strong kinetic bias explains the apparent destabilization observed in H-6, since the T1/2 determined by UV (37.5°C) corresponds to a kinetically trapped species detected by NMR at low concentrations. Interestingly, 1H-NMR spectra of H-6 recorded in conditions in which the dimeric species is predominant exhibit signals in the 15–16 ppm region, as expected for hemiprotonated cytosines (Supplementary Figure S11). Most probably, this structure is a parallel homo-duplex stabilized by C:CH+ base pairs (46,47). Since there is no indication of the formation of this dimeric species in the unmodified oligonucleotide, we must consider that 2′F-araC substitutions may also have an effect on the stabilization of parallel duplexes, which in this case can be considered as intermediates in the formation of i-motif structures.
19F spectra suggest that a similar effect may also occur in H-5. Therefore, the only two exceptions in Table 1 can be explained by the formation of kinetically trapped dimeric structures. To extract more reliable thermodynamic data on the tetrameric i-motif formation, we made use of calorimetric techniques.
DSC is well suited to detecting the presence of multiple structural populations. We performed DSC experiments on H-1 through H-6, finding that the main melting transition occurs at higher temperatures with increasing numbers of 2′F-araC modifications (Figure 5). With the ~33-fold higher concentrations relative to the UV-Vis experiments, the association kinetics were much faster, as reflected in the little to no hysteresis across the sample set (Supplementary Figure S12). Interestingly, an additional low temperature transition became more prominent as the number of modifications was increased (H-3 through H-6). The DSC profile for H-6 was in greatest contrast, with a wide low temperature shoulder and the highest main transition temperature of the entire set. Positive unfolding ΔCp values of ≈2–6 kJ mol−1 K−1 were found for each structural ensemble.
We fit the DSC profiles assuming non-two-state folding behavior, finding in all cases that the extracted structural populations were in close agreement with gel and NMR data. H-1 through H-5 were fit with a thermodynamic model assuming the presence of two tetrameric structures, while the model for H-6 assumed a tetrameric and a dimeric structure, in accordance with gel and NMR results. Thermodynamic parameters resulting from the DSC fits are displayed in Table 3. Because certain samples had slight hysteresis (Supplementary Figure S12), we use T1/2 here as the temperature at which respective structural populations are equal to the population of the monomer. For the samples with no hysteresis, this value is the equilibrium Tm. Optimal data quality with minimal hysteresis was found with scan rates of 0.5°C/min; therefore, we applied this rate across the H-1 to H-6 dataset. The populations extracted from the fit for H-1 show a small fraction of a second tetrameric structure at low temperature. DSC populations for H-2 through H-5 indicate the presence of minor tetramer structures at low temperature, which become the dominant populations toward the T1/2. These unfolding intermediates may be different topologies or frayed structures, such as those observed in G-quadruplexes and homoduplexes (48,49). The NMR for H-4 (discussed below) suggests that the stable structure at pH 7.0 has opened terminal bases. This might be the case for the unfolding intermediates in H-2 through H-5. H-6 populations indicate that the tetrameric structure is favored at low temperature and that the dimeric intermediate structure is strongly populated toward the T1/2 to give the main transition. Importantly, population of intermediate states in H-2 through H-6 is consistent with the NMR data. The NMR spectra contain peaks which shift with increasing temperature, suggesting the initial structures at low temperature are in fast chemical exchange on the NMR timescale (50,51).
In addition to identifying structural populations and quantifying their unfolding thermodynamics, the DSC fitting allows determination of T1/2 values for individual populations, as well as the global T1/2 at which 50% of the structural ensemble is denatured. To quantify the increase in thermal stability due to the modifications, we compared the increase in T1/2 for each modified structure relative to the corresponding unmodified structure (ΔT1/2 = T1/2(modified) − T1/2(unmodified)) with the number of modifications. The ΔT1/2 values as a function of number of modifications are strongly correlated (R = 0.98, 0.98, 0.98), showing that the T1/2 values for structures 1, 2 and the global T1/2 are increased by 7.5, 5.1 and 5.2°C, respectively, with each 2′F-araC modification (Supplementary Figure S13).
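The per-modification increase in T1/2 and the correlation coefficient R correspond to the slope and Pearson coefficient of a straight-line fit of ΔT1/2 against the number of 2′F-araC modifications. The sketch below shows that calculation with NumPy; the ΔT1/2 values are synthetic placeholders chosen for illustration and are not the values behind Supplementary Figure S13.

```python
import numpy as np

# Number of 2'F-araC modifications per strand (0 = unmodified control).
n_mods = np.array([0, 1, 2, 3, 4, 5], dtype=float)

# Synthetic delta-T1/2 values in degrees C (placeholders, NOT measured data).
delta_t_half = np.array([0.0, 5.5, 9.5, 15.5, 19.5, 25.0])

# Straight-line fit: slope is the T1/2 gain per modification.
slope, intercept = np.polyfit(n_mods, delta_t_half, 1)

# Pearson correlation coefficient between the two series.
r = np.corrcoef(n_mods, delta_t_half)[0, 1]

print(f"slope = {slope:.2f} C per modification, R = {r:.3f}")
```

The same two lines of analysis would be applied separately to each population (structure 1, structure 2 and the global ensemble) to obtain one slope and one R per series.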
To get more insight into the origin of the stabilization due to fluoroarabinose modifications, an in-depth structural study was undertaken using NMR spectroscopy. Complete assignment of the NMR spectra of highly repetitive sequences such as those studied here is a difficult task which is further exacerbated in this case by the occurrence of multiple folded species (in the case of the tetrameric structures). However, the presence of 2′F-araCs in the sequence improves chemical shift dispersion and greatly facilitates 2D NMR analysis. In the case of H-4, we could obtain a complete sequential assignment of the major species at pH 5.0, consistent with a 3′E intercalation topology (Supplementary Figure S14). Strong H1′-H1′ cross-peaks and amino-H2′/2″ NOEs clearly confirmed the formation of an i-motif structure. The stacking order could be determined by following H1′-H1′ connectivities along the minor groove between C2-fC5, C2-C6, C6-T1, C3-fC4 and C3-fC5. This pattern was confirmed by the amino-H2′/H2″ contacts observed along the major groove between T1-C6, C2-fC5 and C3-fC4 (Supplementary Figure S14). The exchangeable proton spectra exhibit several signals between 15–16 ppm, indicative of C:CH+ base pairs. These base pairs occur between magnetically equivalent cytosines since each of these signals exhibits cross-peaks with only two amino protons (8–10 ppm). NMR assignments are specified in Supplementary Table S3.
DQF-COSY experiments indicate that deoxyriboses adopt an N-type conformation (except C6), while 2′F-araCs adopt an S-type conformation (Supplementary Figure S15). The three dimensional structure of H-4 was calculated on the basis of 284 experimental distance constraints using restrained molecular dynamics methods, and following standard procedures previously described by our group (37). All residues are well defined, with an RMSD of 0.9 Å (Supplementary Table S4). The final AMBER energies and NOE terms are reasonably low in all the structures, with no distance constraint violation >0.3 Å. The coordinates of the final refined structures were deposited in the PDB (code: 2N89).
The structures show that 2′F-araC fits very well in the otherwise standard 3′E i-motif structure with the stacking order T1-C6-C2-fC5-C3-fC4 (Figure 6 and Supplementary Figure S16). The fluorine atoms in the S-type arabinoses point toward the major groove of the i-motif without affecting sugar–sugar contacts critical for i-motif stability (Supplementary Figure S16). Geometrical parameters are shown in Supplementary Table S5.
The major i-motif species at acidic pH is not the most stable one at neutral pH (Supplementary Figure S17). NMR data indicate that the terminal deoxycytidines (C6) are not protonated and are disordered at pH 7.0. Some key NOEs, such as those shown in Supplementary Figures S18 and 19, indicate that the intercalation order is T1-fC5-C2-fC4-C3 (Figure 6), which is different from that of the major conformer at acidic pH. Although the complete sequential assignment of the NMR spectra of the major conformer at neutral pH could not be carried out, a model structure consistent with the NOEs that could be assigned unambiguously was built and refined with molecular dynamics methods (Figure 6 and Supplementary Figure S19). Similar to the conformer observed at acidic pH, the neutral pH conformer maintains a 3′E topology with one fewer C:CH+ base pair, due to deprotonation of the terminal C:CH+ base pair. Thus, disruption of the C6:C6+ base pair upon raising the pH provokes a rearrangement of the intercalation order, preserving a 3′E topology. This reflects the well-known higher stability of 3′E vs 5′E i-motif topologies (52,53).
Our groups and others have extensively studied the stabilizing effect of fluorine substitutions at the sugar C2′ position in different nucleic acid motifs (54–57). 2′F-Arabinonucleotides, when incorporated into DNA strands, increase binding affinity toward RNA through formation of non-canonical 2′F…purine (H8) hydrogen bonds (58,59). Similar interactions have been observed in pure 2′F-ANA:2′F-ANA duplexes (45). Moreover, the strong electronegativity of fluorine affects the sugar charge distribution (in particular at the H2′ proton) and can induce the formation of FC-H…O hydrogen bonds between sequential sugars. This interaction is responsible for the enhanced stability of 2′F-araG substituted G-quadruplexes (30,60) and 2′F-ANA:RNA hybrid duplexes (29). The structural analysis of H-4 did not provide evidence for FC-H…O hydrogen bonds with the appropriate geometry. However, a number of favorable sequential and inter-strand contacts involving 2′F-araC residues were observed (Figure 7). Many of these close contacts are facilitated by the S-type conformation of the 2′F-arabinoses, and hence are absent in unmodified i-motifs in which the sugar conformations are N-type. In particular, sequential FC-H2′…O4′ and inter-strand FC-H2′…O2 distances are much shorter in S-type 2′F-arabinoses. Since fluorine electronegativity provokes a positive charge polarization at the geminal H2′ proton, its close contacts with electron-dense oxygen atoms cause favorable electrostatic interactions. Other favorable electrostatic interactions involve sequential F-C-H2′…X (where X = O3′, O5′) and inter-strand F…H2N. In addition to favorable electrostatic interactions, a long-range inductive effect may also strengthen hemi-protonated 2′F-araC:2′F-araC+ base pairs, as seen in 2′F-RNA duplexes (56,61).
Stronger base pairing and long-range electrostatic interactions may also explain the large increase in pH1/2 observed for 2′F-araC substituted i-motifs compared to the pKa of free 2′F-araC nucleoside.
The possibility of modulating the stability and pH dependence of i-motif structures through chemical modification has attracted much attention in recent years. Most substitutions tested thus far have led to destabilization, the exceptions being LNA and PNA for selected sequences (21,25). 2′F-araC is a rather conservative DNA-like modification which preserves the structure of natural i-motifs with only minor alterations. This makes 2′F-araC substituted sequences excellent mimics to study molecular recognition processes involving i-motifs. For example, pull-down cell-based assays that identify i-motif binding proteins (including potential i-motif-specific antibodies) will be significantly facilitated by carrying out these experiments under physiological conditions. The fluorine modification also provides an excellent handle to detect i-motif-ligand interactions via 19F NMR methods. Finally, the nuclease resistance properties conferred by 2′F-araC substitutions are important for in vivo applications of i-motif-based nanodevices (14).
We have shown that i-motifs, including tetrameric sequences and those formed by centromeric and telomeric DNA sequences, can be significantly stabilized by replacing dC units with 2′F-araC residues. The effects observed are noteworthy, resulting in some instances in stabilization of the structure by over 10 kJ/mol per 2′F-araC incorporation. Replacement of DNA with 2′F-araC does not alter the overall i-motif structure, creating the opportunity to correlate the structural information of transient arrangements of G4 and i-motifs within duplex DNA with transcription factor or polymerase activity under physiological conditions. This would bring i-motifs into focus as drug targets through the screening of small molecule libraries that bind to i-motifs (62).
This work is dedicated to the memory of Alfredo Villasante, valuable collaborator and friend.
Funding for open access charge: NSERC Discovery grant (to M.J.D., A.K.M.); CIHR DDTP Training Grant (to H.A., R.H.V.); MINECO [BFU2014-52864-R to C.G.]; CSIC-JAE contract (to N.M.P.).
Conflict of interest statement. None declared.