Described are studies systematically exploring structural effects in the use of ethylene glycol (EG) oligomers as non-nucleotide replacements for nucleotide loops in duplex and triplex DNAs. The new structurally optimized loop replacements are more stabilizing in duplexes and triplexes than previously described EG-based linkers. A series of compounds ranging in length from tris(ethylene glycol) to octakis(ethylene glycol) are derivatized as monodimethoxytrityl ethers on one end and phosphoramidites on the other, to enable their incorporation into DNA strands by automated methods. These linker molecules span lengths ranging from 13 to 31 Å in extended conformation. They are incorporated into a series of duplex-forming and triplex-forming sequences, and the stabilities of the corresponding helixes are measured by thermal denaturation. In the duplex series, results show that the optimum linker is the one derived from heptakis(ethylene glycol), which is longer than most previous loop replacements studied. This affords a helix with greater thermal stability than one with a natural T4 loop. In the triplex series, the loop replacements were examined in four separate situations, in which the loop lies in the 5′ or 3′ orientation and the central purine target strand is short or extends beyond the loop. Results show that in all cases the loop derived from octakis(ethylene glycol) (EG8) gives the greatest stability. In the cases where the target strand is short, the EG8-linked probe strands bind with affinities in some cases greater than those with a natural pentanucleotide (T5) loop. For the cases where the target strand extends beyond the linker, the stabilities of the EG8-linked strands are much lower in the 5′ loop orientation than in the 3′ loop orientation. It is found that extension by one additional nucleotide in one of the bonding domains in the EG-linked series can result in considerably greater stabilities with long target strands.
Overall, the data show that optimum loop replacements are longer than would be expected from simple distance analysis. The results are discussed in relation to expected lengths and geometries for double and triple helixes. The findings will be useful in the design of synthetically modified nucleic acids for use as diagnostic probes, as biochemical tools, and as potential therapeutic agents.
There has been growing interest in recent years in the development of synthetically altered nucleic acid structure, for use in diagnostics applications, as tools in the study of gene expression and as potential therapeutic agents.1 Synthetic modification of DNA and RNA structure can lead to potentially useful properties such as increased lifetime in biological media, improved ability to be taken up into cells, or improved hybridization properties.2
Nucleic acid loops are a ubiquitous part of RNA and DNA secondary structure. Nucleotide loops arise in folded domains such as occur in intrastrand duplexes and triple helixes. There has been increasing recent interest in synthetic DNAs and RNAs which are designed to contain hairpin loops. Studies in modification of these structures have led to the development of non-nucleotide linking groups which replace several nucleotides bridging a folded duplex or triplex structure.3-12 Nonnucleotide groups have also been used as linkers in nonfolded structures.13-17 Such linking groups are potentially useful replacements of natural nucleotides for several reasons. First, they can shorten the synthesis of a given structure by several steps, since one relatively long linking group replaces several individual nucleotides which normally constitute a loop. Second, such nonnatural loops can confer resistance to degradation by nucleases which would ordinarily act on a natural loop structure in biological applications.5 Careful design of a loop replacement also has the potential to give more stable folded structure than occurs with the natural loop nucleotides.
Several nucleic acid loop replacements have been documented in the literature.3-12 Maurizot et al. reported the use of hexakis(ethylene glycol) as a replacement for a T4 loop in duplex DNA and found that the replacement led to slightly higher thermal stability than the natural loop.3 Hélène reported the use of the same linker in a folded triplex-forming oligodeoxynucleotide, although its binding stability was not compared to that with a nucleotide loop.4 We reported the use of pentakis- and hexakis(ethylene glycol) chains in the linking of triplex-forming circular DNA molecules.5 In that preliminary study we found that the longer linker of the two was the most stabilizing in triple helical complexes, although the stability with a nucleotide loop was somewhat higher. Letsinger et al. reported a bisamide-based linker for double- and triple-helical complexes but did not test nucleotide loops for comparison.6 For loop replacements in duplex DNAs, Benight and co-workers described the use of dodecyl chains, which were found to be more stabilizing than nucleotide loops.10 In duplex RNAs, Ma et al. reported the use of tris(ethylene glycol) and hexakis(ethylene glycol) loop replacements8,9 and found the latter to be more stabilizing than the former. Similar linkers have been used recently to replace loops in catalytic RNAs;11,12 in those cases catalytic activity, rather than thermodynamic stability, was studied. Ethylene glycol oligomers, and in particular hexakis(ethylene glycol), have been the predominant structures used as such linkers. Ethylene glycols have the advantage of good water solubility, which can be a problem with hydrophobic chains,10 and ethylene glycol oligomers are available in multiple lengths and are easily derivatized in the same manner as nucleosides for incorporation into DNA.
Despite the growing number of laboratories utilizing synthetic loop replacements, there has not yet been a systematic study of simple unmodified ethylene glycol linkers of several different lengths. Such a study would be useful in evaluating the optimum lengths for such linkers, as well as in comparing stabilities with standard nucleotide loops. In an effort to characterize length and structural effects of such linkers both in duplex and triplex DNAs, we have synthesized and studied a series of modified oligodeoxynucleotides incorporating ethylene glycol oligomers containing three to eight monomer units, and thus spanning 13 to 31 Å in distance. We find that the hexakis(ethylene glycol) linker—the most commonly used loop replacement—is generally not the optimum structure for such complexes. The results of stability studies of 65 intra- and intermolecular DNA complexes are described here and are discussed in light of the structures, sequences, and geometries involved.
To test the effects of linker structure on the stability of DNA complexes in which the linker substitutes for a natural nucleotide loop, we studied the six ethylene glycol-derived compounds shown in Figure 1. We abbreviate the linkers EGn, where EG represents ethylene glycol and n is the number of ethylene glycol monomers in the chain. Aside from our own work with the EG4,5 chains,5 only the EG3 and EG6 cases have been previously reported as linker groups; the other two (EG7,8) have been conjugated to the end of oligonucleotides but have not been used to link two DNA sequences. These last two were obtained in that study by separation from a poly(ethylene glycol)-derived mixture.
In the present case we pursued a higher-yield approach to these compounds. The precursor alcohols tris-, tetrakis-, pentakis-, and hexakis(ethylene glycol) were all obtained from commercial sources. Mono-tritylated derivatives of heptakis(ethylene glycol) and octakis(ethylene glycol) were prepared by starting with mono-tritylated tetrakis(ethylene glycol), which was extended with tris- or tetrakis(ethylene glycol) by SN2 displacements of tosylates. We thus obtained the mono-trityl derivatives of all six bis-alcohols and then derivatized the remaining hydroxyl group with a phosphoramidite moiety. In this manner, all six compounds were prepared for automated DNA synthesis, and they were incorporated into DNA strands in good yields using standard coupling protocols.
Our aim was to study the effects of these loop replacements on helix stability both in double- and triple-helical DNA (see Figure 3). In the duplex DNA series we studied a hairpin-type duplex containing a six-base-pair stem bridged either by a natural T4 loop or by one of the EGn series loop replacements. The sequence is 5′-dCGAACG-loop-CGTTCG. This stem sequence was studied previously by Breslauer,22 who showed that a T4 loop was more stabilizing to the hairpin structure than the other natural loops A4, C4, and G4. Altogether we studied one natural hairpin and six with non-nucleotide loop replacements.
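The self-complementarity of the hairpin stem can be checked mechanically. The following is a minimal sketch, not part of the original study; the helper name `reverse_complement` is hypothetical, and only the two stem sequences are taken from the text.

```python
# Watson-Crick complements for the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


# Stem halves quoted in the text: 5'-dCGAACG-loop-CGTTCG
five_prime_arm = "CGAACG"
three_prime_arm = "CGTTCG"

# A hairpin stem can form when the 3' arm is the reverse complement
# of the 5' arm, so the two arms pair in antiparallel fashion.
forms_stem = reverse_complement(five_prime_arm) == three_prime_arm
print(forms_stem)
```

The check says nothing about stability; it only confirms that the two arms on either side of the loop (or EGn linker) can pair as a six-base-pair stem.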
In the triple-helix series we studied the effects of loop structure on bimolecular complexes formed from pyrimidine-rich strands having the sequence 5′-dTTCTTTTC-loop-CTTTTCTT and purine-rich target strands containing the binding site 5′-dAAGAAAAG or the reverse, 5′-dGAAAAGAA.23 We also examined two closely related pyrimidine strands which contain one extra natural nucleotide (underlined): 5′-dTTCTTTTC-loop-TCTTTTCTT and 5′-dTTCTTTTCT-loop-CTTTTCTT. For all the triplex-forming pyrimidine sequences we substituted the loop with non-nucleotide EGn linkers and we compared the results to those with a natural loop of sequence -TTTTT-.
The four specific target sequences studied were 5′-dAAGAAAAG, 5′-dGAAAAGAA, 5′-dAAGAAAAGACCCCC, and 5′-dCCCCCAGAAAAGAA. These are broken down into two sets of categories: the short and extended targets, and the 5′ loop and 3′ loop targets. The short targets represent minimal binding sites for the ligands and allow loops to bridge the pyrimidine domains without interference from the central purine strand. The extended targets were designed to test specifically the interactions of a given loop/linker with the central strand as it extends outward from the complex,24 as might be expected to occur in many practical DNA-binding applications. The two 5′ loop targets are bound by the ligands in an orientation which places the loop near the 5′ end of the central target strand; conversely, 3′ loop targets orient the loop near the 3′ end of the target strand. Figure 3 shows the structures of the 12 types of complexes examined using the three types of pyrimidine-rich ligands and four types of purine-rich target sequences. In all, 48 such “hairpin triplex”-type complexes were examined using natural loops or non-nucleotide linkers.
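Because the probe strand is sequence-symmetrical, the two short targets are bound with the roles of the two pyrimidine domains exchanged, which is what places the loop at opposite ends of the target. The sketch below illustrates this using only the sequences quoted above; the function names are hypothetical, and the pairing rules encoded are the standard ones for a pyrimidine·purine·pyrimidine triplex (antiparallel Watson-Crick strand, parallel Hoogsteen strand with T recognizing A and C recognizing G).

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
# Third-strand partners in a pyrimidine.purine.pyrimidine triplex.
HOOGSTEEN = {"A": "T", "G": "C"}


def watson_crick_strand(target: str) -> str:
    """Antiparallel complement of the purine target, written 5'->3'."""
    return "".join(COMPLEMENT[b] for b in reversed(target))


def hoogsteen_strand(target: str) -> str:
    """Parallel third strand for the purine target, written 5'->3'."""
    return "".join(HOOGSTEEN[b] for b in target)


def domain_roles(first_domain: str, second_domain: str, target: str):
    """Say which probe domain plays which role, or None if the probe
    cannot sandwich this target."""
    wc, hg = watson_crick_strand(target), hoogsteen_strand(target)
    if first_domain == hg and second_domain == wc:
        return {"hoogsteen": "5' domain", "watson_crick": "3' domain"}
    if first_domain == wc and second_domain == hg:
        return {"hoogsteen": "3' domain", "watson_crick": "5' domain"}
    return None


# Probe 5'-dTTCTTTTC-loop-CTTTTCTT against the two short targets.
for target in ("AAGAAAAG", "GAAAAGAA"):
    print(target, domain_roles("TTCTTTTC", "CTTTTCTT", target))
```

Both short targets are accepted, with the Hoogsteen and Watson-Crick assignments swapped between them.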
In addition, to test the effects of the nonnatural linkers on triple helical complexes in a different sequence and structure, we synthesized four circular pyrimidine-rich oligodeoxynucleotides which contain two EGn linkers (Figure 7). These circular ligands were designed to bind a minimal purine target sequence 5′-dAAGAAAAGAAAG, as well as a longer 24-mer containing this target sequence in the center.5
Two important structural features to be considered in the design of loop replacements in DNA are the distances being spanned by the linking group and the geometries required to bridge from one side to the other. In the length analysis one would prefer a linker which is just long enough to span the required distance; a bridge which is too short will cause enthalpic strain in the complex, while one which is too long will be less than optimum from an entropic standpoint.
In the present linker structure, a bridging group carries its own 3′-phosphate group (as do the natural nucleotides in automated DNA synthesis) and then presents its own 5′-OH group to be linked to the 3′-phosphate on the next natural nucleotide. One can use models of duplex and triplex nucleic acids to examine the 5′-O to 5′-O distances involved in a bridge. Figure 2 shows these distances plotted for a B-DNA duplex, an A-RNA duplex, and an A’-type triple helix; the distances are shown from a specific starting point (marked *) on one chain to the various possible linked positions in 5′ and 3′ directions on the other strand. In the triplex case, a linker may in principle either bridge straight across a base triad or bridge in diagonal fashion between nucleotides not in the same triad.
For double-helical structures, results of this distance analysis25,26 show that the bridging distance from one nucleotide to its pairing partner is about 15 Å and is almost the same for either DNA or RNA. Because of the right-handed turn of the helixes, the actual positions of closest approach to the * position on one strand are those two or three nucleotides in the 3′ direction on the opposite strand in DNA and one nucleotide in the 3′ direction in an RNA duplex. In the present case we are concerned initially with loops which span a blunt-end duplex with no overhanging nucleotides (i.e., from * to position 0).
For DNA triplexes, distance analysis shows that at the *-to-0 step (a blunt-end triplex), the distance is about 19 Å, or about 4 Å longer than the corresponding duplex distances. The actual point of closest approach to the * position is the one on the opposite strand four or five nucleotides in the 5′ direction.
The linker lengths in the EGn series range from 13 to 31 Å in fully extended conformation (Figure 1). Thus, in blunt-ended DNA or RNA duplex, one might expect a linker of length ≥ 15 Å (corresponding to EG4 or EG5) to be ideal and the EG6-EG8 linkers to be perhaps overly long. In the blunt-end triplex case, the EG3 and EG4 linkers would be predicted to be too short, the EG5 and EG6 linkers to be nearly optimum, and the EG7 and EG8 linkers to be overly long. However, this analysis does not take into account the starting vectors of the groups being linked, which may not be aimed directly at each other; thus, longer lengths might be required to reorient the chain and aid in reversing the chain direction from one strand to the other. In addition, there may be steric barriers between the linked points that must be bypassed. Thus, it would not be surprising if longer lengths were required than predicted from simple distance analysis.
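The simple length comparison in this paragraph can be written out as a short calculation. This is a rough sketch under stated assumptions: only the EG3 (about 13 Å), EG7 (27.4 Å), and EG8 (30.9 Å) lengths and the 15 Å and 19 Å spans are quoted in the text; the EG4 to EG6 lengths are linearly interpolated here and are not the values from Figure 1. The function names are hypothetical.

```python
# Extended lengths quoted in the text, in angstroms.
QUOTED_LENGTH = {3: 13.0, 7: 27.4, 8: 30.9}


def estimated_length(n: int) -> float:
    """Extended length of EGn in angstroms.

    ASSUMPTION: lengths not quoted in the text (EG4-EG6) are linearly
    interpolated between the quoted EG3 and EG7 values.
    """
    if n in QUOTED_LENGTH:
        return QUOTED_LENGTH[n]
    per_unit = (QUOTED_LENGTH[7] - QUOTED_LENGTH[3]) / (7 - 3)
    return QUOTED_LENGTH[3] + per_unit * (n - 3)


def shortest_spanning_linker(span: float):
    """Smallest n in EG3..EG8 whose extended length covers the span."""
    for n in range(3, 9):
        if estimated_length(n) >= span:
            return n
    return None


print(shortest_spanning_linker(15.0))  # blunt-end duplex span from the text
print(shortest_spanning_linker(19.0))  # blunt-end triplex span from the text
```

This captures only the straight-line criterion; it ignores the starting vectors and steric barriers discussed in the same paragraph, which is why the experimentally optimal linkers turn out longer than this estimate.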
The analysis does indicate that one may expect optimum linker lengths to be similar for RNA and DNA duplexes; in addition, triple-helical structures are likely to require about 4–5 Å of additional length for an optimum bridge. For the EGn series, this 4–5 Å of length corresponds to one to two additional ethylene glycol units relative to the optimum duplex loop replacement. One additional structural aspect to take into consideration in triple-helical structures is the interaction of the loop/linker with the central purine target strand as it extends past the complex. We have previously shown that such interactions can be destabilizing,24 and therefore, longer linkers might aid in passing around the steric bulk of this central strand. Alternatively, alteration of the linking geometry might aid in avoiding this problem (see below).
Six EGn-linked duplexes were studied at pH 7.0 in the presence of 100 mM NaCl and 10 mM MgCl2. Stabilities of the duplexes were measured by thermal denaturation experiments monitored at 260 nm in a UV-vis spectrophotometer. Melting temperature (Tm) values were determined from the first derivative of the temperature versus absorbance data, and the free energies were determined by nonlinear least squares fitting of the data to a two-state model with linear sloping baselines. We studied the six duplexes over a concentration range from 0.2 to 10 μM, and the Tm values were all independent of concentration over this range, confirming the hairpin-type unimolecular structure of the duplexes.
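The first-derivative determination of Tm can be illustrated on a synthetic curve. The sketch below is not the authors' analysis code: it generates an idealized unimolecular two-state melting curve with flat baselines (the fits described above used linear sloping baselines), and the enthalpy of 50 kcal/mol and the absorbance values are arbitrary illustrative assumptions. Only the 67.8 °C melting temperature is taken from the text, where it is reported for the T4-looped hairpin.

```python
import math

R = 1.987e-3  # gas constant in kcal/(mol K)


def two_state_absorbance(temp_c, tm_c, dh, a_folded=1.00, a_unfolded=1.20):
    """Absorbance for a unimolecular two-state transition (flat baselines).

    dh is the unfolding enthalpy in kcal/mol (ASSUMED value in this sketch).
    """
    t, tm = temp_c + 273.15, tm_c + 273.15
    k_unfold = math.exp(-dh / R * (1.0 / t - 1.0 / tm))
    theta = k_unfold / (1.0 + k_unfold)  # fraction unfolded
    return a_folded + (a_unfolded - a_folded) * theta


def tm_from_first_derivative(temps, absorbances):
    """Temperature at which dA/dT, by central differences, is largest."""
    best_t, best_slope = None, float("-inf")
    for i in range(1, len(temps) - 1):
        slope = (absorbances[i + 1] - absorbances[i - 1]) / (
            temps[i + 1] - temps[i - 1]
        )
        if slope > best_slope:
            best_t, best_slope = temps[i], slope
    return best_t


temps = [20.0 + 0.5 * i for i in range(141)]  # 20 to 90 degrees C
curve = [two_state_absorbance(t, tm_c=67.8, dh=50.0) for t in temps]
estimated_tm = tm_from_first_derivative(temps, curve)
print(estimated_tm)
```

For a unimolecular transition the equilibrium constant does not depend on strand concentration, which is why a concentration-independent Tm is taken as evidence for the hairpin structure.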
Figure 4A shows representative denaturation curves for these six complexes, which contain the EG3 through EG8 linkers. Examination of the curves shows that the duplexes with the four shortest linkers are similar in appearance but with the curves shifting toward higher temperatures as the linkers get longer. The duplexes with the EG7 and EG8 linkers show curves with relatively high Tm’s but with lower cooperativity and hyperchromicity. Table 1 displays the numerical data, some of which is plotted in Figure 5. The plot of Tm values as a function of linker length shows that the Tm’s rise from a low of 62.5 °C with the shortest loop to 74.9 °C with the heptakis(ethylene glycol)-derived loop. This is the point of highest thermal stability; on lengthening the chain further to EG8, the thermal stability then drops somewhat. The Tm values in this case offer both the most accurate and precise measurement of stabilities; individual Tm values for a given complex generally fall within 0.5–1.0 °C of each other, and our reported values are averaged from repeated experiments with standard deviations generally less than ±0.5 °C. Examination of free energy data (calculated at 37 °C) shows that it follows a similar profile, but with a maximum at EG5 and EG6, or shorter than the Tm data would indicate. The free energy numbers, however, are considerably less accurate because they are extrapolated from melting which occurs at much higher temperatures. In addition, the precision for ΔG° is less, at ±5–10%. It is therefore difficult to judge the stabilities on the basis of these free energy measurements. At higher temperatures (60–70 °C), the free energy data agree with the Tm optimum at the EG7-linked structure.
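The loss of accuracy on extrapolation can be made concrete with the van't Hoff relation for a unimolecular two-state transition, ΔG°(T) = ΔH°(1 − T/Tm), which assumes a temperature-independent enthalpy. The following sketch is illustrative only: the enthalpy of 50 kcal/mol and the 10% uncertainty are assumed numbers, the function names are hypothetical, and the sign convention is for unfolding (positive below Tm), opposite to the folding free energies reported in the tables.

```python
def delta_g_unfolding(temp_c, tm_c, dh):
    """van't Hoff estimate dG = dH * (1 - T/Tm) in kcal/mol.

    Assumes a two-state, unimolecular transition with dCp = 0.
    """
    t, tm = temp_c + 273.15, tm_c + 273.15
    return dh * (1.0 - t / tm)


def delta_g_spread(temp_c, tm_c, dh, rel_error):
    """Half-width of the dG range when dH is uncertain by +/- rel_error."""
    high = delta_g_unfolding(temp_c, tm_c, dh * (1.0 + rel_error))
    low = delta_g_unfolding(temp_c, tm_c, dh * (1.0 - rel_error))
    return abs(high - low) / 2.0


# Tm of 74.9 C is the EG7 hairpin value from the text; dH = 50 is ASSUMED.
for temp_c in (37.0, 70.0):
    print(temp_c, delta_g_spread(temp_c, tm_c=74.9, dh=50.0, rel_error=0.10))
```

The same relative uncertainty in the fitted enthalpy translates into a larger absolute uncertainty in ΔG° the further the evaluation temperature lies from Tm, in line with the caution expressed above about the 37 °C values.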
Thus, the heptakis(ethylene glycol) linker gives the highest thermal stability of the series for this DNA duplex. It would appear that, if higher thermal stability is desired, then the EG7 linker may be superior to the commonly-used EG6 linker for DNA duplexes. Although we have not studied RNA duplexes with this linker series, the geometric analysis leads to the prediction that heptakis(ethylene glycol) may be the optimum length for RNA as well. It is interesting that this optimum (a linker length of 27.4 Å) exceeds the minimal straight-line distance being bridged in the DNA duplex by 12 Å. This is apparently the extra length required to “turn the corner”, that is, reverse the strand orientation by ~180° and avoid any repulsive steric interactions as well.
Since the EGn linkers are used as loop replacements, it is reasonable to compare the stabilities of those hairpins with one bridged by a natural T4 loop. The T4 loop in this context has been shown to be the most stable of the natural homo-tetraloops,22 and four-nucleotide loops are thought to be the optimum length for natural loops in duplex DNA.27
The results show that the EG7-linked hairpin is significantly more thermally stable than the hairpin with the T4 loop. Under the same conditions, the latter has a Tm of 67.8 °C, or 7 °C lower than with the optimum non-nucleotide linker. In fact, all the linkers from EG5 through EG8 are more stabilizing than the natural loop in this sequence context. It is not yet clear why this is the case; the high flexibility of the ethylene glycol chains would normally be expected to entropically disfavor stability relative to more rigid nucleotide loops, although it is possible that the EG chain retains some entropic freedom even in the complex.
To our knowledge, only one previous study has reported the use of ethylene glycol oligomers in duplex DNA.3 In that case, hexakis(ethylene glycol) was studied and the results were compared with the same sequence bridged by a T4 loop. That study found a 3 °C increase in Tm with the non-nucleotide loop. Our findings are therefore consistent with that result.
Two studies have focused on the effects of ethylene glycol and related linkers on the stability of RNA duplexes.8,9 In those studies it was found that a hexakis(ethylene glycol) linker was more thermally stabilizing than a short tris(ethylene glycol) linker; comparison with nucleotide loops found the EG6 linker to be more stabilizing than a hexanucleotide loop in one case, but less stabilizing than a tetranucleotide loop in another. In this case our results are not directly comparable, since our study does not involve RNA helixes.
One previous study described the use of dodecyl chains, rather than ethylene glycols, as loop replacements in duplex DNA structure;10 comparisons to a tetranucleotide loop showed that the non-nucleotide linkers were substantially more stabilizing. The authors suggested, based on model building, that hexakis(ethylene glycol) may be too long to be optimum as a loop replacement. Interestingly, we find that this is not the case; as noted above, an extra ~12 Å in length beyond the modeled bridging length is apparently needed for highest stability. It remains to be seen whether longer alkyl chains would give greater loop stability; however, hexadecyl chains have been reported to cause solubility problems when attached to DNA.10
There has been considerable and growing interest recently in the use of triplex-forming oligonucleotides to bind to single-stranded DNA29 and RNA28i,28m,29 by triple-helix formation. This new general strategy often involves the construction of an oligonucleotide which contains two binding domains (Watson-Crick and Hoogsteen) bridged by one or two loops. In a pyrimidine·purine·pyrimidine triplex, this involves linking two pyrimidine domains by a loop and using the resulting ligand to bind a purine-rich complementary site.
We have shown that, using natural nucleotides, the optimum loop length for such a triplex appears to be five nucleotides,28c or one longer than optimum for duplex. In a preliminary study,5 we have also replaced the two pentanucleotide loops in a circular triplex-forming ligand with pentakis- and hexakis(ethylene glycol) linkers, and it was found that the EG6-linked case gave higher stability in binding a minimal target sequence than the EG5-linked case, and both of these were somewhat less stable than the complex with a cyclic ligand bridged by pentanucleotide loops. Additional experiments showed that the use of the non-nucleotide linkers conferred higher resistance to degradation by nuclease enzymes.
In the present study we wished to explore a wide range of linker lengths, to additionally explore any effects of 5′ or 3′ loop orientation, and finally to investigate any interactions between the loops and an extended target strand. We therefore studied binding to four target sequences, both short and extended, and using both 5′ and 3′ loop orientations, and tested the effect of linker length on these complexes. We initially studied the series using the sequence-symmetrical probe strand 5′-dTTCTTTTC-loop-CTTTTCTT. For comparison we also measured the stability of the triplex using the same sequence bridged by a T5 loop.
The effect of varied linker length was first examined with minimal eight-base targets in the 5′ and 3′ loop orientations. The thermal denaturation data for the bimolecular complexes are listed in Table 2 and are plotted graphically in Figure 6. The experiments were carried out at pH 7.0, with 100 mM NaCl and 10 mM MgCl2. Figure 4B also shows representative melting plots for one set from the EGn series in the 5′ loop orientation with the minimal target. The curves indicate that the complexes are well-behaved and give apparently two-state, all-or-none denaturation. All of the complexes have similar hyperchromicity and curve shape.
Results with the short targets show a significant dependence on linker length, with Tm values ranging from 17.7 to 23.5 °C in the 5′ loop series. The results for the 3′ loop series are experimentally indistinguishable from those with the 5′ loop orientation. In both cases, the Tm increases monotonically with increasing linker length and is highest at the EG8-linked point. Examination of the overall trends (Figure 6A) shows that, although the series has not passed beyond an optimum, it appears to be approaching it with only small improvements evident from EG7 to EG8. The Tm differences are statistically significant even in this smaller last step. The free energy values for the EG6 through EG8 complexes fall within experimental error of each other and so are not helpful in distinguishing differences.
Thus, it would appear that in general, with blunt-ended triplexes, the linker which gives highest thermal stability in this series is the octakis(ethylene glycol)-derived linker. Interestingly, we cannot rule out the possibility that a nonakis(ethylene glycol) linker may be slightly more stabilizing, although the trend approaching this length would indicate that this longer loop is not likely to give large improvements. The data also show that there is no significant difference in loop stabilities for the 5′ and 3′ loop orientations.
As seen for the duplex series above, the optimum linker is considerably longer than straight-line distance analysis would indicate. The EG8 linker has a length of 30.9 Å, or about 12 Å longer than the distance being bridged. It is interesting to note that this is precisely the extra length which is optimum for the duplex linkers and which was ascribed to the length needed to reverse the strand orientation. It does seem, as predicted by the distance analysis, that the optimum for triplex linkers is indeed one to two ethylene glycol units longer than that for duplex.
In most practical applications, such triplex-forming ligands would bind to purine-rich target sites within a much longer strand of DNA. To test the effect of interactions between loops or loop replacements and the target strand as it passes beyond the loop, we studied two longer target sequences which contain six extra nucleotides, an A followed by five C residues. In one case this extension is placed at the 5′ end of the target site, and in the other, the 3′ end. This allows for the testing of loop interactions in either loop orientation.23 The same EGn-looped ligand series was hybridized to these two longer targets and evaluated again by thermal denaturation.
The data from these experiments are listed in Table 3 and are plotted graphically in Figure 6B. Results show that the stabilities depend strongly on which loop orientation is being studied: the range of Tm values in the 5′ orientation is 11.1–14.5 °C, whereas in the 3′ orientation the range is higher overall, 15.6–20.0 °C. Thus, the least thermostable 3′ loop is more stable than the strongest 5′-type loop; in addition, both cases show lower thermal stability than do complexes with the minimal eight-base binding sites (compare parts A and B of Figure 6). From these two observations, we can conclude that there are destabilizing interactions between the loop and the central strand and that the magnitude of the interaction depends on the strand orientation relative to the loop.
Also interesting are the overall trends as a function of length for the two orientations (Figure 6B): for the 3′ loop series, the Tm data increase monotonically with length to the EG8-linked compound, as was seen with the short target strands. However, with the 5′-type loops, the Tm actually drops initially with the longer target, reaching a low at the hexakis(ethylene glycol) point, and then it increases again, to the highest point at the EG8 case. Again, these effects are statistically significant, since the standard deviations for the Tm values are generally less than ±0.5 °C, much smaller than the magnitude of the actual variation seen.
These effects are likely to be caused by steric interactions between the linker/loop and the target strand. Models show that, in a DNA triple helix, a straight line drawn from one pyrimidine backbone to the other one (at the same base step) must pass through the steric space of the central purine strand and that the specific group involved seems to be the next base beyond the target site in the central strand. Such a steric interference would explain the lower Tm with the longer target strands for both loop orientations; in addition, since the two loop orientations cause differences in the local structure, this might explain why the Tm behavior as a function of length is different in the two cases. Examination of the models does not suggest an obvious explanation for the bimodal behavior seen for the 5′ loop series, and so it is not yet clear what causes this effect.
In any case, we find that the optimum loop for bridging triplexes in either orientation is the octakis(ethylene glycol)-derived loop. We cannot rule out the possibility in this case that longer loops might give yet higher stability; indeed, this might be expected since the loop may have to travel a greater distance to avoid destabilizing steric interactions with the extended central strand. In addition, we find that the commonly-used linker derived from hexakis(ethylene glycol) is a relatively poor choice, especially if the 5′ loop orientation is desired. Finally, we conclude that, with extended target strands, all the loop replacements in this series are significantly destabilized by interaction with the central strand.
Aside from our preliminary report,5 few reports exist in the literature on the use of non-nucleotide linkers in triplex DNA. Two groups have used hexakis(ethylene glycol) linkers bridging pyrimidine strands in triple helixes,4,7 but neither of these studies evaluated other linker lengths or compared the stabilities to those with nucleotide loops. A study of Letsinger et al. reported the use of a bis-amide linker to bridge triplex structure,6 but the stability relative to that with a nucleotide loop was not evaluated.
The comparison of our loop replacement series with an optimized T5 nucleotide loop23 in the same triplexes reveals that, although none of the above synthetic replacements leads to stabilization relative to the natural-looped case, the best replacements approach this value. When the pyrimidine probe strand dTTCTTTTC-T5-CTTTTCTT is hybridized to the two short targets under identical conditions, we find Tm values of 25.2 and 24.4 °C for the 5′ and 3′ loops, respectively, and corresponding free energies (37 °C) of −5.3 and −5.7 kcal/mol. Thus, this natural loop gives only slightly higher thermal stability (by 1.7 and 1.1 °C for 5′ and 3′ loops) than is shown by the most stable EG8-linked compounds.
In the case where the target strand is extended, the difference between natural and nonnatural loops becomes more pronounced. The Tm values for binding of the T5-looped sequence to the longer target strands are actually higher than with the short targets. The Tm values in the extended case are 29.5 °C in the 5′ loop orientation and 26.3 °C in the 3′ loop orientation. These values are 4.3 and 1.9 °C higher than those for the same ligand with the short targets; thus, the interaction between the natural loop and extended strand bases is a stabilizing one. As seen above, however, the corresponding interaction with the non-nucleotide linkers is significantly destabilizing. The Tm difference between the T5-looped ligand and the optimized EG8-linked oligomer with these longer target strands is 15.2 °C in the 5′ loop orientation and 6.3 °C in the 3′ orientation.
Possible explanations for this difference lie in structural differences between these natural and nonnatural linkers. First, the natural pentanucleotide loop may be longer; this structure could span as long a distance as 33–34 Å in an extended conformation, while the octakis(ethylene glycol) linker spans a shorter ~31 Å. Second, the nucleotide backbone is more rigid than the ethylene glycol chain, which favors the former entropically in binding. Third, the nucleotide loop is likely to benefit significantly from favorable stacking interactions involving the loop bases. Finally, in the natural loop, there are probably stabilizing base pairing interactions between the first/last loop bases and the base adjacent to the minimal binding site.24
To test the effects of structural variation in the EGn series in a second triplex system, we incorporated these linkers as loop replacements in circular triplex-forming oligonucleotides. These ligands consist of two pyrimidine-rich binding domains linked at both ends by two separate loops or loop replacements, and they are designed to bind purine-rich targets by sandwiching the target between the two pyrimidine domains. A preliminary study with the EG5- and EG6-linked circles in binding short 12-mer targets showed the latter to bind with higher affinity.5 In the present study, we wished to study additionally the EG7- and EG8-linked cases and to test the ability to bind longer targets as well. The data for these experiments, carried out at pH 7.0 with 100 mM NaCl and 10 mM MgCl2, are presented in Table 4 and graphically in Figure 7.
Comparison of linkers of varied length in this new structure shows length effects which parallel those seen for the hairpin-type triplex studies. Results in the binding of the minimal 12-base targets show that binding affinity increases with increasing linker length up to the maximum length of the EG8-linked compound. Tm values vary strongly from 37.3 °C for the EG5-doubly-linked case to 53.7 °C for the EG8-linked case. The differences in free energy are also substantial and in this case are statistically significant. We find that free energies vary correspondingly from −8.8 to −17.09 kcal/mol (37 °C) from the shortest to longest chain. This is a similar result to the observations made in the singly-linked hairpin-type triplex-forming ligands, in which there was a preference for the longest loop replacement. Interestingly, a version of this cyclic ligand bridged by pentanucleotide loops binds with somewhat lower affinity, with a Tm of 52.2 °C and a free energy (37 °C) of −14.1 kcal/mol. Thus, in this case, the best loop replacements are superior to an optimized five-nucleotide loop, and this is the first example in triplex DNA wherein loop replacements have outperformed natural loops.
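To put the free energy differences on a more intuitive scale, they can be converted to ratios of association constants through the standard relation K = exp(−ΔG/RT). The sketch below is only an illustration under stated assumptions: it takes the free energies exactly as printed above, treats them as standard free energies of association at 37 °C, and ignores the ±5–10% uncertainty quoted for individual measurements. The function names are hypothetical.

```python
import math

R_KCAL = 1.9872e-3   # gas constant, kcal/(mol*K)
T_KELVIN = 310.15    # 37 deg C

# Free energies (kcal/mol, 37 deg C) as printed in the text.
DG_EG5_CIRCLE = -8.8
DG_EG8_CIRCLE = -17.09
DG_T5_CIRCLE = -14.1


def association_constant(delta_g):
    """Association constant implied by a standard free energy of binding."""
    return math.exp(-delta_g / (R_KCAL * T_KELVIN))


def affinity_ratio(delta_g_a, delta_g_b):
    """How many times tighter ligand A binds than ligand B."""
    return math.exp(-(delta_g_a - delta_g_b) / (R_KCAL * T_KELVIN))


print(f"K (T5 circle)  ~ {association_constant(DG_T5_CIRCLE):.2e} per molar")
print(f"EG8 vs T5 circle: ~{affinity_ratio(DG_EG8_CIRCLE, DG_T5_CIRCLE):.0f}-fold")
print(f"EG8 vs EG5 circle: ~{affinity_ratio(DG_EG8_CIRCLE, DG_EG5_CIRCLE):.1e}-fold")
```

Because the ratio depends exponentially on ΔΔG, a difference of about 3 kcal/mol between the EG8-linked and pentanucleotide-bridged circles corresponds to roughly a hundredfold difference in affinity, although the measurement uncertainty makes the exact factor approximate.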
When the same ligands are hybridized to the same binding site embedded in a longer 24-mer sequence, the same length-dependent trend is seen, with a preference for the longest EG8 linker. Again, as seen for the previous triplex-forming EG-linked ligands, the overall stabilities are lower than with the short target sequence. Thus, this destabilization by unfavorable loop-central strand interaction is consistently found in all cases with non-nucleotide linkers.
The above results indicate that there is a general destabilizing interaction between the non-nucleotide loop replacements and the central strand as it passes beyond the loop. Examination of models of triplex structures suggests a strategy to alleviate such an unfavorable interaction: extension of one of the two binding domains at the end of the triplex near the loop, so that one strand is lengthened in the 5′ direction. In such a case, the linker bridges two nucleotides not in the same base step, but instead in adjacent (or further) steps. The distance analysis (Figure 2) shows that such a 5′ extension actually brings the two groups being linked closer together than when they are in the same base step, from a *-to-position-0 link (18.6 Å) to a *-to-position-−1 link (17.1 Å). In addition, the models show that, because of the right-handed turn of the helix, this extension moves a straight-line linking group further away from the interfering central strand.
To test this possibility we constructed several additional hairpin-type ligands targeted to the sequences dAAGAAAAG or dGAAAAGAA. These differ from the previous ligands by having one extra nucleotide (a T) as an extension adjacent to the loop. An EGn-linked series, where n = 4–8, was constructed with a 5′ extension; when binding in 5′ loop orientation, this makes the extension part of the Watson-Crick binding domain, and in the 3′ loop orientation, the extension is part of the Hoogsteen domain. In addition, one ligand with a 3′ extension was constructed having an EG6-derived linker; this reverses the position of the extension with respect to loop orientation. These new ligands were hybridized to the same short and long target sequences as was previously done. The results are tabulated in Tables 2 and 3 and are presented graphically in Figure 8, where they are compared with results with nonextended ligands.
Results with the 5′-extended and 3′-extended ligands with the minimal 8-mer targets show that lengthening one strand by one base on either side of the loop increases the stability of all complexes up to the longest, the EG8-linked case (see Figure 8A,B). With this last exception, the Tm values are higher with the two extended ligand types than they are with the blunt, nonextended ligand. For these short targets, the behavior is almost identical regardless of 5′ or 3′ loop orientation (compare parts A and B of Figure 8). With the 5′-extended series, the dependence on linker length is small relative to the nonextended series, and the stability peaks broadly at the EG5–7 cases and then begins to drop. As pointed out previously, the nonextended ligands markedly increase their binding affinity with increasing loop length all the way to the EG8-linked case. One 3′-extended ligand was tested for comparison, and results show (Figure 8A,B) that this extension is for both loop orientations less stabilizing than the 5′ extension.
Results are considerably different when the extended ligands are hybridized to longer strands which interact with the loops. Figure 8C,D shows these results, again compared to the binding of the nonextended (blunt-ended) ligands. As discussed above, for the 5′ loop orientation, the blunt-ended ligands bind very poorly. The 5′-extended ligands, by comparison, all bind more tightly, with Tm values 2–5 °C higher depending on length. The strongest binding case is that with the EG8-derived linker. Interestingly, the 3′-extended ligand binds better than the 5′-extended one at the EG6 case (Figure 8C). This is the only example where the 3′ extension has better success than the 5′-extended version, although it remains to be seen what the effect of longer linkers would be in this case.
Finally, for the 3′ loop orientation, we find that the 5′-extended ligands are quite successful at stabilizing the complexes relative to the unextended ligands. Results show that they bind the longer target even more strongly than they do the short target (by a margin of about 1 °C in Tm), indicating that there are new favorable interactions with the central strand. The Tm versus length profile is very flat, with a broad maximum from EG5 to EG7. Comparison to the unextended ligands shows that the extension adds considerable affinity, with Tm values increasing by 4–8 °C. In this case we see that the 3′-extended ligand binds less strongly than the 5′-extended ones do.
In general, then, these single-nucleotide extensions allow for considerable alleviation of the strained interactions between the non-nucleotide linkers and the longer target strands. The general finding of increased stabilities by extension of one strand supports the argument that the extended nucleotide undergoes pairing interactions with the purine strand, rather than simply serving as part of the linker. Although the affinities do not quite rise to the level of optimized pentanucleotide loops, this strategy represents a considerable improvement over the use of blunt-end ligands bridged by non-nucleotide linkers.
With short target sequences which do not extend beyond the triplex, the findings with the ligands which are extended by one nucleotide in the 5′ direction near the loop can be summarized as follows: the extension gives some added benefit with suboptimum linkers, but simple use of the longest (octakis(ethylene glycol)) linker makes this unnecessary. Loop orientation makes little or no difference. In general, we can therefore recommend the octakis(ethylene glycol) linker as the best available non-nucleotide loop with short targets. In some cases this non-nucleotide loop gives complexes more stable than those with nucleotide loops.
If the target site is found within a longer strand of DNA, the findings can be summarized as follows: if the loop is to be oriented to the 3′ end of such a triple-helical complex, then the best loop design (of those studied here) will include an added base at the 5′ end of the Watson-Crick domain next to the loop and will be linked by EG5, EG6, or EG7 loop replacements. The resulting affinity can approach that of an optimized pentanucleotide loop. However, if a 5′ loop orientation is necessary and a non-nucleotide loop replacement is to be used, then considerably lower binding affinity will result. The best such design in this case will be a ligand which is extended in the 5′ direction in the Hoogsteen domain near the loop, with a linker derived from octakis(ethylene glycol). However, an optimized pentanucleotide loop24 will give considerably higher binding affinity.
Hairpin-type triplex-forming ligands which are closed on one end by a loop can be designed to bind a target with the loop in either 5′ or 3′ orientation.23 Our results show that, if one is using non-nucleotide linkers, there is a clear binding advantage to ligands with the 3′ loop orientation. The results seem to indicate that this may be due to a greater steric interference at the 5′ end of the triplex, although more detailed structural studies will be necessary to confirm this. One possible way to alleviate this problem (aside from the use of natural nucleotide loops) may be the use of linkers which are yet longer than those in this study.
It should be noted that, considerations of binding affinity aside, non-nucleotide loop replacements have other potentially valuable properties. They shorten the synthesis of a ligand quite substantially; for example, replacement in a duplex gives a sequence three steps shorter than would be the case for a commonly-used tetranucleotide loop. In triplexes, where pentanucleotide loops are commonly used, such a replacement shortens the synthesis by four steps each time it occurs in a strand. In addition, since ethylene glycol oligomers are relatively inexpensive, they may cost less than nucleotides even on a one-for-one basis. Finally, such non-nucleotide loops can increase the resistance of an oligonucleotide to degradation in biological media.5
It is clear that non-nucleotide linkers can be efficient replacements for natural nucleic acid loops, but it is equally clear that the stability of the folded helical structures depends strongly on linker length and structure. The following summarizes the conclusions from this study:
We anticipate that these findings will be useful in the design of modified DNA structures, with significant practical applications possible in biological systems.
1H and 13C NMR spectra were obtained with a GE 300 MHz instrument, and chemical shifts are reported in ppm on the δ scale with the solvent as an internal reference. For 31P NMR spectra an external reference of 85% phosphoric acid was used. Column chromatography was performed with EM Science silica gel 60 (230–400 mesh). Thin layer chromatography (TLC) was performed on silica gel 60 (Merck) F-254 precoated 0.25 mm plates. Mass spectral analyses were performed by the University of California at Riverside Mass Spectrometry Facility.
Dichloromethane and triethylamine were dried by distillation from calcium hydride, and pyridine was distilled from barium oxide. Pentakis(ethylene glycol) was obtained from Lancaster Synthesis. All other solvents and chemicals were purchased from Aldrich, Sigma, J. T. Baker, or Fisher unless otherwise noted.
The phosphoramidite derivatives of (dimethoxytrityl)tetrakis-, (dimethoxytrityl)pentakis-, and (dimethoxytrityl)hexakis(ethylene glycol) were prepared as described previously.5 (Dimethoxytrityl)tris(ethylene glycol) 2-(cyanoethyl) N,N-diisopropylphosphoramidite was obtained from Glen Research.
Tetrakis(ethylene glycol) (40 mmol) was coevaporated twice with anhydrous pyridine (10 mL), and the residue was dissolved in dry CH2Cl2 (10 mL). To this were added dry Et3N (14.4 mmol) and 4-(N,N-dimethylamino)pyridine (DMAP) (0.40 mmol). Next, 2.71 g of 4,4′-dimethoxytrityl chloride (DMTCl) (8.0 mmol) was added in small portions under argon. The mixture was stirred at room temperature for 1–3 h and monitored by TLC (EtOAc). The solution was diluted with CH2Cl2 (100 mL) and washed twice with 5% NaHCO3 (50 mL) and once with H2O (50 mL). The organic phase was dried over anhydrous sodium sulfate, filtered, and concentrated to an oily residue under reduced pressure. This residue was purified by chromatography on silica gel. The column was equilibrated with EtOAc containing 1% Et3N and the product eluted with EtOAc. The product was obtained as a pale yellow oil in 63% yield (2.49 g, 5.0 mmol): Rf 0.35 (EtOAc); 1H NMR (CDCl3, ppm) 2.55 (br s, 1H, OH), 3.25 (t, 2H, J = 5 Hz, DMTOCH2), 3.60–3.63 (m, 2H, CH2CH2OH), 3.67–3.78 (m, 12H, OCH2CH2O), 3.80 (s, 6H, OCH3), 6.82–6.85 (m, 4H, arom H), 7.21–7.49 (m, 9H, arom H); 13C NMR (CDCl3, ppm) 55.8, 62.4, 63.8, 71.1, 71.4, 71.7, 73.1, 86.6, 113.7, 127.3, 128.4, 128.7, 128.9, 130.7, 131.0, 131.1, 137.0, 145.7, 159.0.
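The molar amounts and yield in this tritylation can be cross-checked from the stated masses alone. The sketch below is a hedged illustration, not part of the original procedure: it assumes the molecular formulas C21H19ClO2 for DMTCl and C29H36O7 for the mono-DMT ether of tetrakis(ethylene glycol), uses rounded average atomic masses, and treats DMTCl as the limiting reagent because the glycol is in large excess. All names are illustrative.

```python
# Rounded average atomic masses (g/mol).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999, "Cl": 35.45}

# Assumed molecular formulas, derived from the compound names.
DMT_CL = {"C": 21, "H": 19, "Cl": 1, "O": 2}   # 4,4'-dimethoxytrityl chloride
DMT_EG4 = {"C": 29, "H": 36, "O": 7}           # mono-DMT tetrakis(ethylene glycol)


def molar_mass(formula):
    """Molar mass in g/mol from an element-count dictionary."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def millimoles(mass_g, formula):
    """Amount in mmol corresponding to a mass in grams."""
    return 1000.0 * mass_g / molar_mass(formula)


reagent_mmol = millimoles(2.71, DMT_CL)    # limiting reagent, from the stated 2.71 g
product_mmol = millimoles(2.49, DMT_EG4)   # isolated product, from the stated 2.49 g
percent_yield = 100.0 * product_mmol / reagent_mmol

print(f"DMTCl: {reagent_mmol:.2f} mmol")
print(f"Product: {product_mmol:.2f} mmol")
print(f"Yield: {percent_yield:.0f}%")
```

The same two helper functions can be reused for the later tosylation and coupling steps by supplying the corresponding formulas and masses.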
The monotritylated tetrakis(ethylene glycol) (5.0 mmol) was dissolved in CHCl3 (5 mL) which had been filtered through a plug of alumina. To this were added dry pyridine (10.0 mmol) and DMAP (0.25 mmol). The mixture was cooled in an ice bath (0 °C), and p-toluenesulfonyl chloride (7.52 mmol) was added in small portions under argon with constant stirring over a period of 1 h. After 2.5 h, the reaction had reached completion (monitored by TLC). The solution was diluted with CH2Cl2 (50 mL) and washed twice with 5% NaHCO3 (40 mL) and once with H2O (40 mL). The organic phase was dried over anhydrous sodium sulfate, filtered, and concentrated to an oily residue under reduced pressure. This residue was purified by flash chromatography on silica gel. The column was equilibrated with EtOAc:hexanes (1:1) containing 1% Et3N and the product eluted with EtOAc:hexanes (1:1). The product was obtained as a yellow oil in 85% yield (2.77 g, 4.3 mmol): Rf 0.31 (EtOAc:hexanes, 1:1); 1H NMR (CDCl3, ppm) 2.45 (s, 3H, ArCH3), 3.24 (t, 2H, J = 5 Hz, DMTOCH2), 3.61–3.75 (m, 12H, OCH2CH2O), 3.80 (s, 6H, OCH3), 4.15 (t, 2H, J = 5 Hz, CH2OTs), 6.82–6.85 (m, 4H, arom H), 7.23–7.49 (m, 11H, arom H), 7.79–7.82 (m, 2H, arom H); 13C NMR (CDCl3, ppm) 21.3, 54.9, 62.9, 68.4, 69.0, 70.3, 70.5, 85.7, 112.8, 126.4, 127.5, 127.6, 127.9, 129.6, 129.8, 132.9, 136.1, 144.5, 144.9, 158.2.
(4,4′-Dimethoxytrityl)tetrakis(ethylene glycol) p-toluenesulfonate (0.513 mmol) was lyophilized from anhydrous dioxane (2.5 mL) and then dissolved in anhydrous DMF (5.0 mL) and HMPA (0.45 mL). To this was added tris(ethylene glycol) (2.57 mmol) followed by Cs2CO3 (0.565 mmol) and NaH (0.565 mmol). The mixture was stirred for 40 h at room temperature under an argon atmosphere. The solution was diluted with CH2Cl2 (30 mL) and washed three times with H2O (30 mL), twice with 5% NaHCO3 (30 mL), and then once with saturated NaCl solution (30 mL). The organic phase was dried over anhydrous sodium sulfate, filtered, and concentrated to an oil under reduced pressure. This residue was purified by flash chromatography on silica gel using methanol:EtOAc (1:9) as the eluent. The product was obtained as an oil in 42% yield (137 mg, 0.218 mmol): Rf 0.47 (methanol:EtOAc, 1:9); 1H NMR (CDCl3, ppm) 2.98 (br s, 1H, OH), 3.23 (t, 2H, J = 5 Hz, DMTOCH2), 3.57–3.60 (m, 2H, CH2CH2OH), 3.64–3.68 (m, 24H, OCH2CH2O), 3.77 (s, 6H, OCH3), 6.80–6.83 (m, 4H, arom H ortho of OCH3), 7.20–7.48 (m, 9H, arom H); 13C NMR (CDCl3, ppm) 54.9, 61.3, 62.8, 70.0, 70.2, 70.3, 70.4, 70.6, 72.2, 85.6, 112.7, 126.3, 127.4, 127.9, 129.7, 136.0, 144.8, 158.1.
This compound was synthesized in a manner similar to that for formation of (4,4′-dimethoxytrityl)heptakis(ethylene glycol) above. The coupling of (4,4′-dimethoxytrityl)tetrakis(ethylene glycol) p-toluenesulfonate (0.513 mmol) with tetrakis(ethylene glycol) (2.57 mmol) followed by purification with silica gel chromatography (methanol:EtOAc, 3:97) gave the product as an oil in 51% yield (0.176 g, 0.262 mmol): Rf 0.43 (methanol:EtOAc, 1:9); 1H NMR (CDCl3, ppm) 2.68 (br s, 1H, OH), 3.24 (t, 2H, J = 5 Hz, DMTOCH2), 3.62–3.73 (m, 30H, OCH2CH2O, CH2CH2OH), 3.81 (s, 6H, OCH3), 6.82–6.85 (m, 4H, arom H), 7.20–7.49 (m, 9H, arom H); 13C NMR (CDCl3, ppm) 54.8, 61.2, 62.9, 69.9, 70.2, 70.4, 72.4, 85.6, 112.7, 126.3, 127.4, 127.9, 129.7, 136.0, 144.8, 158.1.
To (4,4′-dimethoxytrityl)heptakis(ethylene glycol) (0.318 mmol) dissolved in dry CH2Cl2 (0.75 mL) were added N,N-diisopropylethylamine (DIPEA) (1.27 mmol) and 2-cyanoethyl N,N-diisopropylchlorophosphoramidite (0.477 mmol) via syringe. The mixture was stirred at room temperature for 20 min under argon. The solution was diluted with EtOAc (10 mL) and Et3N (0.5 mL) and washed twice with 10% Na2CO3 (8 mL) and then twice with saturated NaCl solution (8 mL). The organic layer was dried over anhydrous sodium sulfate, filtered, and concentrated under reduced pressure. The residue was purified by flash chromatography on silica gel using EtOAc:hexanes (3:1). The product was obtained as a clear oil in 67% yield (0.213 mmol): Rf 0.37 (EtOAc:hexanes, 3:1); 1H NMR (CDCl3, ppm) 1.15–1.19 (2d, 12H, CH(CH3)2), 2.63 (t, 2H, J = 6 Hz, CH2CN), 3.21 (t, 2H, J = 5 Hz, DMTOCH2), 3.56–3.86 (m, 40H, OCH3, OCH2CH2O, CH2OP, POCH2CH2CN, NCH(CH3)2), 6.79–6.82 (m, 4H, arom H), 7.20–7.48 (m, 9H, arom H); 13C NMR (CDCl3, ppm) 20.0, 20.1, 24.3, 42.7, 42.9, 54.9, 58.1, 58.4, 62.2, 62.4, 62.9, 70.3, 70.4, 70.7, 70.9, 71.0, 85.6, 112.7, 126.3, 127.4, 127.9, 129.8, 136.1, 144.8, 158.1; 31P NMR (CDCl3, ppm) 149.1; HRFAB (CHCl3/NBA/PPG matrix) (M - H)+ calcd for C44H65N2O11P 827.4251, found 827.4245.
This compound was prepared in a manner similar to that for formation of the (dimethoxytrityl)heptakis(ethylene glycol) phosphoramidite above. The reaction of (4,4′-dimethoxytrityl)octakis(ethylene glycol) (0.297 mmol) with 2-cyanoethyl N,N-diisopropylchlorophosphoramidite (0.446 mmol) followed by flash chromatography on silica gel using EtOAc:hexanes (3:1) gave the product as an oil in 80% yield (0.238 mmol): Rf 0.32 (EtOAc:hexanes, 3:1); 1H NMR (CDCl3, ppm) 1.18–1.22 (2d, 12H, CH(CH3)2), 2.66 (t, 2H, J = 6 Hz, CH2CN), 3.24 (t, 2H, J = 5 Hz, DMTOCH2), 3.59–3.89 (m, 40H, OCH3, OCH2CH2O, CH2OP, POCH2CH2CN, NCH(CH3)2), 6.82–6.85 (m, 4H, arom H), 7.23–7.51 (m, 9H, arom H); 13C NMR (CDCl3, ppm) 20.0, 20.1, 24.2, 24.3, 24.4, 42.7, 42.9, 54.9, 58.1, 58.4, 62.2, 62.4, 62.9, 70.29, 70.36, 70.42, 70.46, 70.68, 70.9, 71.0, 85.6, 112.7, 126.3, 127.4, 127.9, 129.8, 136.1, 144.8, 158.1; 31P NMR (CDCl3, ppm) 149.1; HRFAB (CHCl3/NBA/PPG matrix) (M - H)+ calcd for C46H68N2O12P 871.4513, found 871.4548.
DNA oligonucleotides were synthesized on an Applied Biosystems 392 synthesizer using standard β-cyanoethyl phosphoramidite chemistry.19 Stepwise coupling yields for the non-natural residues were all greater than 95% as determined by trityl cation monitoring. Oligomers were purified by preparative 20% denaturing polyacrylamide gel electrophoresis, isolated by the crush and soak method, and quantitated by absorbance at 260 nm. Molar extinction coefficients were calculated by the nearest neighbor method.20 Oligodeoxynucleotides were obtained after purification as the sodium salt.
For the bimolecular complexes, solutions for the thermal denaturation studies contained a one-to-one ratio of two complementary oligomers. Concentrations for given experiments are listed in the text and figure legends. The buffer used was 100 mM NaCl, 10 mM MgCl2, and 10 mM Na·PIPES (1,4-piperazinebis(ethanesulfonate)). The buffer pH is that of a 2× stock solution at 25 °C containing the buffer and salts. After the solutions were prepared, they were heated to 90 °C and allowed to cool slowly to room temperature prior to the melting experiments.
The melting studies were carried out in Teflon-stoppered 1 cm path length quartz cells under a nitrogen atmosphere on a Varian Cary 1 UV-vis spectrophotometer equipped with a thermoprogrammer. Absorbance was monitored at 260 nm while temperature was raised at a rate of 0.5 °C/min; a slower heating rate with this apparatus does not affect the results. In all cases the complexes displayed sharp, apparently two-state transitions, with all-or-none melting from bound complex to free oligomers. Melting temperatures (Tm) were determined by computer fit of the first derivative of absorbance with respect to 1/T. Uncertainty in Tm is estimated at ±0.5 °C on the basis of repetitions of experiments. Free energy values were derived by computer fitting the denaturation data with an algorithm employing linear sloping baselines, using a two-state model for melting.21 Uncertainty in individual free energy measurements is estimated at ±5–10%.
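The kind of analysis described here can be illustrated with a short simulation. The sketch below is not the fitting program used in this work (ref 21). It is a minimal Python illustration, under assumed parameters, of a two-state bimolecular melting curve with linear sloping baselines and of a Tm estimate taken from the extremum of dA/d(1/T). The enthalpy, entropy, strand concentration, and baseline values are hypothetical placeholders, and the two strands are assumed to be present at equal concentration.

```python
import math

R = 1.9872e-3  # gas constant, kcal/(mol*K)


def fraction_bound(temp_k, dh, ds, strand_conc):
    """Two-state fraction of complex for two strands, each at strand_conc (M)."""
    k = math.exp(-(dh - temp_k * ds) / (R * temp_k))
    kc = k * strand_conc
    # Numerically stable root of K*c*(1 - a)^2 = a.
    return 2.0 * kc / ((2.0 * kc + 1.0) + math.sqrt(4.0 * kc + 1.0))


def absorbance(temp_k, dh, ds, strand_conc):
    """Absorbance with linear sloping baselines (hypothetical values)."""
    alpha = fraction_bound(temp_k, dh, ds, strand_conc)
    bound = 0.80 + 0.0005 * (temp_k - 273.15)
    free = 1.00 + 0.0005 * (temp_k - 273.15)
    return alpha * bound + (1.0 - alpha) * free


def tm_from_derivative(temps_k, absorbances):
    """Temperature at the extremum of dA/d(1/T), by central differences."""
    best_t, best_slope = None, 0.0
    for i in range(1, len(temps_k) - 1):
        d_inv_t = 1.0 / temps_k[i + 1] - 1.0 / temps_k[i - 1]
        slope = (absorbances[i + 1] - absorbances[i - 1]) / d_inv_t
        if abs(slope) > abs(best_slope):
            best_t, best_slope = temps_k[i], slope
    return best_t


def tm_half_bound(dh, ds, strand_conc):
    """Temperature at which half of the strands are bound (K = 2/c)."""
    return dh / (ds - R * math.log(2.0 / strand_conc))


# Hypothetical thermodynamic parameters and strand concentration.
DH, DS, CONC = -60.0, -0.165, 1.0e-6   # kcal/mol, kcal/(mol*K), mol/L
temps = [273.15 + 0.5 * i for i in range(181)]   # 0 to 90 deg C in 0.5 deg steps
curve = [absorbance(t, DH, DS, CONC) for t in temps]

tm_deriv = tm_from_derivative(temps, curve)
tm_half = tm_half_bound(DH, DS, CONC)
print(f"Tm from derivative: {tm_deriv - 273.15:.1f} C")
print(f"Tm at half-bound:   {tm_half - 273.15:.1f} C")
```

For a bimolecular transition the derivative extremum and the half-bound temperature need not coincide exactly, which is one reason to state which definition of Tm is used when comparing values between studies.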
We thank the National Institutes of Health and the Office of Naval Research for support. E.T.K. also acknowledges a Dreyfus Foundation Teacher-Scholar Award, an Alfred P. Sloan Foundation Fellowship, and an American Cyanamid Faculty Award.