An unnatural base pair that is replicated and transcribed with good efficiency would lay the foundation for the long-term goal of creating a semi-synthetic organism, but would also have immediate in vitro applications, such as the enzymatic synthesis of site-specifically modified DNA and/or RNA. One of the most promising of the unnatural base pairs that we have identified is that formed between d5SICS and dMMO2. The ortho substituents of these nucleotides are included to facilitate unnatural base pair extension, presumably by forming a hydrogen bond with the polymerase, but the synthesis of the unnatural base pair still requires optimization. Recently, we have shown that meta and/or para substituents within the dMMO2 scaffold can facilitate unnatural base pair synthesis, although the mechanism remains unclear. To explore this issue, we synthesized and evaluated several dMMO2 derivatives with meta-chlorine, -bromine, -iodine, -methyl, or -propinyl substituents. Complete characterization of unnatural base pair and mispair synthesis and extension reveals that the modifications have large effects only on the efficiency of unnatural base pair synthesis and that the effects likely result from a combination of changes in steric interactions, polarity, and polarizability. The results also suggest that functionalized versions of the propinyl moiety of d5PrM should serve as suitable linkers to site-specifically incorporate other chemical functionalities into DNA. Similar modifications of d5SICS should allow labeling of DNA with two different functionalities, and the previously demonstrated efficient transcription of the unnatural base pair suggests that derivatives might similarly enable site-specific labeling of RNA.
The four letter genetic alphabet is conserved throughout nature and is based on the two Watson-Crick base pairs formed via the complementary hydrogen-bonding (H-bonding) patterns of adenine with thymine and guanine with cytosine. However, significant and increasing effort has been directed toward re-engineering the alphabet to include a third, unnatural base pair.[1-14] Such unnatural base pairs might eventually be used as part of an expanded genetic code in vivo, but are likely to find more immediate in vitro applications, such as the site-specific labeling of enzymatically synthesized DNA or RNA. The earliest attempts toward expanding the genetic alphabet, pioneered by Benner and co-workers, relied on designing nucleobases with alternate H-bonding patterns. While several stable pairs were identified, this approach is complicated by facile tautomerization, although recent efforts have made progress in the identification of substituents that reduce the problem.[2, 15] Recent efforts have also drawn on results from Kool and co-workers demonstrating that H-bonding is not an absolute requirement for polymerase-mediated replication, and work from this[4, 5, 17-19] and other labs[12-14, 20-25] has clearly shown that hydrophobic and packing forces are sufficient not only for selective pairing in duplex DNA, but also for enzymatic DNA synthesis, and in some cases even RNA synthesis.[6, 12, 26]
Among the most promising unnatural base pairs that we have identified are those based on small, predominantly hydrophobic aromatic nucleobases that are derivatized with an H-bond acceptor at the position ortho to the glycosidic bond.[4-6, 19] After synthesis of the unnatural base pair (by insertion of the unnatural triphosphate opposite its cognate nucleotide in the template), these ortho substituents are required for efficient extension (by insertion of the next correct dNTP).[2, 7, 19, 23, 27-29] This likely results from the nucleotides adopting an anti orientation about the glycosidic bonds (in analogy to the natural nucleotides), which would position the H-bond acceptors in the developing minor groove where they participate in a functionally critical interaction with H-bond donors of A family polymerases.[29, 31, 32] For example, the nucleotides d5SICS and dMMO2 (Scheme 1A) form a particularly promising unnatural base pair whose relatively efficient replication is critically dependent upon the ortho sulfur and methoxy substituents.
Despite the relatively efficient recognition of the unnatural base pair, it is still not replicated as efficiently as a natural base pair, and it is most limited by the rate of insertion of dMMO2 opposite d5SICS in the template.[3, 5] Given the essentiality of the ortho substituents of dMMO2, our continuing efforts toward its optimization have focused on meta and para derivatizations. We found that d5FMTP and dNaMTP (Scheme 1B) are each inserted opposite d5SICS with greater efficiency and fidelity than dMMO2TP, suggesting that, similar to ortho substituents, meta and para substituents can also facilitate polymerase recognition. However, unlike the ortho substituents, the mechanism by which the meta and para substituents affect replication remains unclear. Meta derivatization of dMMO2 is also interesting because such substituents are expected to be positioned in the developing major groove, where, in analogy to the C5 substituents of natural pyrimidines, they might be used to attach functional groups, such as fluorophores or other moieties with interesting chemical or physical properties. Thus, understanding how different meta substituents affect replication is not only expected to help further optimize the unnatural base pair, but it should also help develop methodologies to site-specifically label oligonucleotides. The unnatural pairs formed between d5SICS and either dMMO2 or dNaM are also efficiently transcribed in both directions (i.e. each analogue efficiently directs the incorporation of the other into RNA), and thus these studies are also expected to help develop methodologies to site-specifically label RNA with two different functional groups, each attached to one of the unnatural nucleotides.
To systematically explore the effect of meta substituents on DNA polymerase recognition, we synthesized and evaluated five dMMO2 derivatives (Scheme 1C). These include derivatives bearing meta-chlorine, -bromine, -iodine, -methyl, or -propinyl substituents. Unnatural base pair synthesis and extension are fully characterized, as is the synthesis and extension of all possible mispairs. The data allow a thorough assessment of the contribution of major groove substituent size and electrostatics to unnatural base pair replication. We find that the modifications only have large effects on the rates at which the unnatural triphosphates are inserted into the growing primer terminus, which appear to result from a combination of steric effects, polarity, and polarizability.
The unnatural nucleotides d5ClM, d5BrM, d5IM, d5MeM, and d5PrM are dMMO2 derivatives that were designed as potential partners for d5SICS (Scheme 1C). When evaluated with the previously characterized nucleotides d5FM and dNaM (Scheme 1B), d5ClM, d5BrM, and d5IM provide a systematic variation of the size and electrostatics of the major groove substituent. In addition, d5MeM and d5PrM were designed to characterize the effects of size and polarizability in the absence of large changes in dipole moment, and also to begin to explore the possibility of using meta-attached linkers to append other functional groups.
The synthesis of each analogue is described in the Supporting Information. Briefly, all derivatives were synthesized via Heck coupling using the appropriate 2-methoxybenzene derivatives and 2′-deoxyribose glycal. In the case of d5BrM and d5IM, synthesis commenced with the dMMO2 nucleobase moiety, which was first coupled to tert-butyl-[[3-(tert-butyl-dimethylsilyloxy)-2,3-dihydrofuran-2-yl]methoxy]dimethylsilane to give the corresponding nucleoside and then halogenated. For d5ClM and d5MeM, 4-chloro-3-methylanisole and 3,4-dimethylanisole, respectively, were iodinated and then coupled to the same protected sugar. In each case, the major coupling product was the desired β-anomer, which was separated from the minor α-anomer by silica gel column chromatography. d5PrM was synthesized from d5IM via Sonogashira coupling.
Free nucleosides were converted to triphosphates or phosphoramidites using standard procedures, and phosphoramidites were used to synthesize DNA containing the unnatural nucleotides at a single position via standard procedures. In all cases, the effect of the modification on polymerase recognition was assessed by determining the steady-state efficiency (i.e., the second order rate constant kcat/KM) with which the Klenow fragment of E. coli DNA polymerase I (Kf) synthesizes the unnatural base pair, by insertion of the unnatural triphosphate opposite an analogue in the template, and extends the resulting unnatural primer terminus, by insertion of the next correct natural triphosphate. The corresponding rates of synthesis and extension for mispairs with natural nucleotides were also measured to determine fidelity.
To begin to characterize how the different substituents impact unnatural base pair replication, we first determined the steady-state rates of insertion of dMMO2 derivatives opposite d5SICS in the template (Table 1). For reference, with an otherwise identical primer-template, Kf inserts dATP opposite dT with a second-order rate constant of 3.2 × 10⁸ M⁻¹ min⁻¹ and dMMO2, d5FM, and dNaM opposite d5SICS with second-order rate constants of 3.6 × 10⁵, 3.6 × 10⁶, and 5.0 × 10⁶ M⁻¹ min⁻¹, respectively.
Replacing the fluoro substituent of d5FM with a chloro, bromo, or iodo substituent results in reduced efficiencies of insertion for the unnatural triphosphates opposite d5SICS; however, the magnitude of the reduction is highly variable. For the chloro substituent, efficiency is reduced only 4-fold, while for the bromo substituent, it is reduced 150-fold. The pronounced effect with bromine substitution is not simply due to size, as the iodo substituted derivative d5IMTP is inserted with an efficiency that is intermediate between the chloro and bromo analogues. In fact, the efficiency of d5IMTP insertion is similar to that for insertion of the parent analogue dMMO2TP, which lacks a major groove substituent. This suggests that efficient insertion is disfavored by large major groove substituents, but favored by large dipole moments (i.e. d5FMTP) and polarizability (i.e. d5IMTP).
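The fold changes quoted above can be converted back into approximate second-order rate constants. The following is a minimal sketch in Python, assuming only the reference values given in the text (3.6 × 10⁶ M⁻¹ min⁻¹ for d5FMTP and 3.6 × 10⁵ M⁻¹ min⁻¹ for dMMO2TP) and the stated fold reductions; the roughly 10-fold figure for the iodo analogue is the one given in the Discussion. The variable and function names are illustrative, and the output is an estimate implied by the prose, not the values reported in Table 1.

```python
# Second-order rate constants (kcat/KM, M^-1 min^-1) quoted in the text.
D5FMTP = 3.6e6   # d5FMTP inserted opposite d5SICS
DMMO2TP = 3.6e5  # parent dMMO2TP inserted opposite d5SICS

# Approximate fold reductions relative to d5FMTP, as stated in the prose.
# These are rounded figures, so the results below are estimates only.
FOLD_REDUCTION = {"d5ClMTP": 4, "d5BrMTP": 150, "d5IMTP": 10}


def implied_efficiency(reference, fold):
    """Return the rate constant implied by a fold reduction from a reference."""
    return reference / fold


implied = {
    name: implied_efficiency(D5FMTP, fold)
    for name, fold in FOLD_REDUCTION.items()
}

# Express each estimate relative to the unsubstituted parent, dMMO2TP.
relative_to_parent = {name: value / DMMO2TP for name, value in implied.items()}

for name in FOLD_REDUCTION:
    print(
        f"{name}: ~{implied[name]:.1e} M^-1 min^-1, "
        f"{relative_to_parent[name]:.2f}x dMMO2TP"
    )
```

Under these assumptions, the iodo analogue comes out at the same efficiency as dMMO2TP, which matches the statement that d5IMTP insertion is similar to that of the parent analogue.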
To further examine the effects of major groove derivatization in the absence of large changes in dipole moment, we examined d5MeMTP and d5PrMTP, in which a methyl and a propinyl group, respectively, are positioned in the developing major groove. Insertion efficiencies opposite d5SICS in the template are similar for the two analogues and nearly identical to that of dMMO2TP. These data suggest that simple hydrophobicity is not the dominant factor differentiating the recognition of the nucleotides. In addition, while substituent size within the halide series appears to be important, the increased size associated with the propinyl group relative to the methyl group has little effect on insertion efficiency. This may be due to a different spatial distribution, or to the offset of a small unfavorable interaction by a favorable increase in polarizability.
To examine how the modifications affect unnatural base pair synthesis when the dMMO2 derivative is in the template, we measured the rates of d5SICSTP incorporation (Table 2). In contrast to the large effects observed with triphosphate derivatization, modifications to the templating nucleobase had relatively little effect on the efficiency of unnatural base pair synthesis. Thus, we conclude that Kf is not sensitive to the major groove substituent of the dMMO2 derivatives in the template.
We next examined whether modification of the dMMO2 derivatives in the template affects the rates of mispair synthesis. We first measured the rate of insertion of each unnatural nucleotide opposite itself in the template (i.e. ‘self pair’ synthesis; note that with self pairs, modifications are made by definition to both the incoming and templating nucleotide) (Table 2). We observed the same substituent dependence for the efficiency of self pair synthesis as we did for insertion of the dMMO2 analogues opposite d5SICS in the template: the d5FM self pair was synthesized most efficiently, followed by the d5ClM and d5IM self pairs. The efficiencies of self pair synthesis with d5MeM and d5PrM were again similar to each other, as well as to that of dMMO2, with rates ranging only from 1.2 × 10⁵ to 6.2 × 10⁵ M⁻¹ min⁻¹. Given these observations, we conclude that, like unnatural base pair synthesis, the efficiency of self pair synthesis is influenced more by the nature of the triphosphate than by the nature of the templating nucleobase.
Characterization of natural dNTP insertion opposite a dMMO2 derivative (Table 2) revealed that neither dGTP nor dCTP insertion is detectable (kcat/KM < 10³ M⁻¹ min⁻¹) and that dTTP insertion is only barely detectable (kcat/KM = 2.4 – 7.3 × 10³ M⁻¹ min⁻¹). In contrast, dATP is generally inserted more efficiently (1.0 × 10⁵ to 7.4 × 10⁵ M⁻¹ min⁻¹). As has been suggested with other predominantly hydrophobic nucleobases in the template, this likely results from a combination of adenine’s hydrophobicity, its packing ability, and an interstrand intercalation mode of pairing as discussed in detail previously. While mispair synthesis is highly dependent on which natural dNTP is inserted, the major groove substituent of the dMMO2 derivative in the template was again less important. Thus, the data support the hypothesis that the nature of the natural or unnatural triphosphate is more important for efficient synthesis than the nature of the templating unnatural nucleotide.
To characterize how the different major groove substituents affect the extension of the unnatural base pair, we synthesized primers that terminate at their 3′ end with one of the dMMO2 derivatives paired opposite d5SICS in the template. The next correct base in the template is dG, and the steady-state rate at which Kf inserts dCTP was measured (Table 3). For comparison, the unnatural pairs with dMMO2, d5FM, and dNaM paired opposite d5SICS are extended with efficiencies of 1.9 × 10⁶, 5.5 × 10⁶, and 1.2 × 10⁶ M⁻¹ min⁻¹, respectively. While the extension efficiencies of d5ClM:d5SICS (primer:template) and d5BrM:d5SICS are similar (2.0 × 10⁶ and 2.6 × 10⁶ M⁻¹ min⁻¹), extension of d5IM:d5SICS is slightly less efficient (7-fold). The contributions of sterics and electrostatics were deconvoluted by characterizing extension of the dMMO2:d5SICS, d5MeM:d5SICS, and d5PrM:d5SICS unnatural base pairs. The meta methyl substituent has little effect on extension, while the propinyl group has a slightly deleterious effect (7-fold decrease). Thus, while substituents with increased size at the primer terminus (i.e. d5IM and d5PrM) reduce the efficiency of extension, the effects are generally small, and the data suggest that the modifications have less of an effect on replication once incorporated into the growing primer terminus than they do during unnatural triphosphate insertion.
Finally, we characterized extension efficiency and fidelity of primers terminating with d5SICS paired opposite each dMMO2 derivative in the template (Table 4). For comparison, the d5SICS:dMMO2, d5SICS:d5FM, and d5SICS:dNaM pairs are extended with efficiencies of 6.9 × 10⁵, 2.3 × 10⁶, and 2.7 × 10⁶ M⁻¹ min⁻¹, respectively. As with unnatural base pair synthesis, the extension efficiency of the modified unnatural base pairs is largely independent of template derivatization, with efficiencies ranging only from 2.2 × 10⁶ to 5.3 × 10⁶ M⁻¹ min⁻¹.
Extension of the dMMO2, d5FM, and dNaM self pairs is inefficient, with kcat/KM values of only 5.3 × 10³ to 2.6 × 10⁴ M⁻¹ min⁻¹. The derivatized self pairs are extended with similar or slightly reduced efficiencies, varying between 4.2 × 10³ and 3.4 × 10⁴ M⁻¹ min⁻¹ (Table 4). The d5MeM and d5PrM self pairs are extended with efficiencies of 6.5 × 10⁴ and 4.2 × 10³ M⁻¹ min⁻¹, respectively. Interestingly, we observed the same substituent dependence of extension efficiency for the self pairs as we did for derivative triphosphate insertion opposite d5SICS, although the effects were smaller (10-fold versus 150-fold). As with dMMO2, extension of mispairs with dT or dC at the primer terminus is surprisingly efficient (6.6 × 10⁵ to 5.6 × 10⁶ M⁻¹ min⁻¹), while extension of mispairs with dA is slightly less efficient (2.2 × 10⁵ to 9.6 × 10⁵ M⁻¹ min⁻¹), and extension of mispairs with dG is not detectable. Generally, the nature of the templating dMMO2 derivative had little effect on mispair extension.
The unnatural base pair formed between d5SICS and dMMO2 is reasonably well replicated by Kf and was originally identified from the optimization of the most promising unnatural base pair identified from a screen of 3600 candidates. The substituents that are ortho to the glycosidic bond, and presumably oriented into the developing minor groove during DNA synthesis, appear to be essential. However, other modifications, for example, the increased aromatic surface area of dNaM or the halide of d5FM, result in even more efficient replication, suggesting that meta and para substituents may also be important. Thus, we examined meta derivatization of dMMO2 through halide substitution to alter substituent size and dipole moment, as well as through methyl and propinyl substitution to specifically examine the effects of substituent size and polarizability. In addition to elucidating the physical forces mediating unnatural base pair recognition, these studies were expected to aid development of a methodology to use the unnatural base pairs for site-specific modification of DNA and RNA.
Meta substitution had little effect on unnatural base pair or mispair synthesis with the dMMO2 derivatives in the template, and relatively small effects on extension with the derivatives present at either the primer terminus or in the template. Thus, the meta substituents do not significantly facilitate or interfere with these steps of replication. In contrast, the modifications examined had large effects on the efficiency of Kf-mediated triphosphate insertion, with the second-order rate constants varying 150-fold. The fluoro-modified triphosphate is inserted by Kf opposite d5SICS the most efficiently, while the chloro derivative is inserted only 4-fold less efficiently. While the bromo derivative is inserted 150-fold less efficiently than the fluoro derivative, the iodo derivative is inserted only 10-fold less efficiently. These trends do not simply parallel nucleobase hydrophobicity, dipole moment of the aryl-halide bond, or the van der Waals radii of the halides. Considering that significant packing interactions are likely to be introduced during nucleotide insertion, it is reasonable to assume that modifications that favor packing will facilitate insertion. However, because the van der Waals radii of bromine and iodine are the largest of the substituents tested (1.85 Å and 1.98 Å, respectively), and significantly larger than that expected to be easily accommodated between the nucleobases (which are separated by 3.34 Å in native B-form DNA), their presence is likely to be at least marginally destabilizing. Thus, it seems that increased dipole moment (d5FM) or polarizability (d5IM) favors unnatural triphosphate insertion, while the inclusion of substituents that are too large (d5BrM and d5IM) disfavors it. The intermediate efficiency with which d5IMTP is inserted likely reflects opposed and somewhat compensating effects of favorable polarizability and unfavorable sterics.
The similar efficiency with which d5PrMTP is inserted opposite d5SICS, relative to d5MeMTP, may reflect similarly compensating effects of polarizability and sterics.
The results are particularly interesting from the perspective of the effort to expand the genetic alphabet. While relatively well replicated compared to other analogues, the replication of the unnatural base pair formed between d5SICS and dMMO2 is limited by the relatively slow insertion of dMMO2TP opposite d5SICS in the template. However, dMMO2 is better recognized by the polymerase during the other steps of replication (i.e. when in the template or during extension when present in either the primer or the template). Importantly, the data suggest that derivatizations at the meta position of dMMO2 might be used to specifically optimize the insertion of the triphosphate, without interfering with the other steps of replication. The search for such meta modifications is currently underway.
For a site-specific labeling strategy that is compatible with polymerase amplification, the data suggest that propinyl linkers, such as those already used with the natural nucleotides and with other unnatural base pairs,[12, 14] and present in d5PrM, are promising. These linkers are composed of propargyl amines, with the amine serving as a reactive site to attach other functionalities. The data suggest that such linkers are also likely to be well tolerated within the dMMO2 scaffold. In this manner, virtually any functionality might be attached to the dMMO2 derivative and site-specifically incorporated into DNA during enzymatic synthesis. Moreover, it seems likely that d5SICS derivatives will allow analogous site-specific modification, such that a single duplex could be site-specifically modified with two different functional groups. The efficient transcription of the unnatural base pair would also allow site-specific labeling of RNA with two different moieties. Such DNA and RNA should find a variety of academic and biotechnological applications, and efforts toward these applications are currently underway.
All reactions were carried out in oven-dried glassware under inert atmosphere, and all solvents were dried over 4 Å molecular sieves with the exceptions of dichloromethane, which was distilled from CaH₂, and tetrahydrofuran, which was distilled from sodium and potassium metal. All other reagents were purchased from Aldrich. All unnatural nucleosides and nucleotides used in this study were synthesized as described in Supporting Information. ¹H, ¹³C, and ³¹P NMR spectra were recorded on Bruker DRX-500, Varian Inova-400, or Mercury 300 spectrometers. High resolution mass spectroscopic data were obtained from the core facility at The Scripps Research Institute. T4 polynucleotide kinase was purchased from New England Biolabs, Kf from GE Healthcare, and [γ-³²P]-ATP from MP Biomedicals.
Oligonucleotides were prepared by the β-cyanoethylphosphoramidite method on controlled pore glass supports (1 μmol) using an Applied Biosystems Inc. 392 DNA/RNA synthesizer according to standard methods. After automated synthesis, the oligonucleotides were cleaved from the support and deprotected by heating at 55 °C for 12 h. The crude product was further purified by polyacrylamide gel electrophoresis, followed by electroelution. The resulting purified oligonucleotides were precipitated with 80% ethanol and dried overnight. Oligonucleotide concentration was determined by UV absorption. DNA primers with unnatural nucleotides at their 3′-termini were synthesized using 3′-phosphate CPG (Glen Research) and then purified as described above, followed by treatment for 1 h at 37 °C with CIP (0.5 U) (New England Biolabs) to produce free 3′-OH groups.
Primer oligonucleotides were 5′ radiolabeled with [γ-³²P]-ATP and T4 polynucleotide kinase. Templates were annealed to primers in the reaction buffer by heating to 90 °C followed by slow cooling to ambient temperature. Assay conditions included 40 nM primer/template, 0.1-1.3 nM Kf, 50 mM Tris-HCl, pH 7.5, 10 mM MgCl₂, 1 mM DTT, and 50 μg/mL acetylated BSA. The reactions were carried out by combining the DNA-enzyme mixture with an equal volume (5 μL) of 2× dNTP stock solution, incubating at 25 °C for 1-10 min, and quenching by the addition of loading dye (95% formamide, 20 mM EDTA, and sufficient amounts of bromophenol blue and xylene cyanol; 20 μL). The reaction mixture was then analyzed by 15% polyacrylamide/8 M urea denaturing gel electrophoresis. Radioactivity was quantified using a Phosphorimager and the ImageQuant program (Molecular Dynamics) with overnight exposures. The kobs values were plotted against triphosphate concentration and the data were fit to the Michaelis-Menten equation (Kaleidagraph, Synergy Software) to determine kcat and KM. The data presented are the average of three independent determinations.
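The extraction of kcat, KM, and the second-order rate constant kcat/KM from kobs versus triphosphate concentration can be illustrated with a short Python sketch. This is not the procedure used here (the fits were performed in Kaleidagraph by nonlinear regression); for simplicity the sketch swaps in a Hanes-Woolf linearization, which recovers the same parameters for noise-free data. The concentrations, kcat, and KM below are hypothetical values chosen only for illustration, and the function names are likewise assumptions.

```python
def michaelis_menten(s, kcat, km):
    """Observed rate at substrate concentration s (same units as km)."""
    return kcat * s / (km + s)


def fit_hanes_woolf(conc, kobs):
    """Estimate (kcat, KM) from a linear fit of s/kobs versus s.

    Rearranging the Michaelis-Menten equation gives
    s / kobs = s / kcat + KM / kcat,
    so the slope is 1/kcat and the intercept is KM/kcat.
    """
    xs = list(conc)
    ys = [s / k for s, k in zip(conc, kobs)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kcat = 1.0 / slope
    km = intercept * kcat
    return kcat, km


# Hypothetical, noise-free data (not measurements from this study).
true_kcat = 12.0  # min^-1
true_km = 25.0    # micromolar
conc = [5.0, 10.0, 25.0, 50.0, 100.0, 200.0]  # micromolar dNTP
kobs = [michaelis_menten(s, true_kcat, true_km) for s in conc]

kcat, km = fit_hanes_woolf(conc, kobs)
efficiency = kcat / (km * 1e-6)  # convert KM to molar: M^-1 min^-1

print(f"kcat = {kcat:.2f} min^-1, KM = {km:.2f} uM")
print(f"kcat/KM = {efficiency:.2e} M^-1 min^-1")
```

With real, noisy data a direct nonlinear fit is generally preferable, since the linearization weights the points unevenly; the sketch is meant only to show how the reported second-order rate constants relate to the fitted parameters.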
Funding was provided by the National Institutes of Health (GM60005 to F.E.R.) and Korea Research Foundation (KRF-2006-352-C00047 to Y.J.S.).
Supporting information for this article is available on the WWW under http://www.chembiochem.org or from the author.