Locked nucleic acids (LNA; symbols of bases, +A, +C, +G, and +T) are introduced into chemically synthesized oligonucleotides to increase duplex stability and specificity. To understand these effects, we have determined thermodynamic parameters of consecutive LNA nucleotides. We present guidelines for the design of LNA oligonucleotides and introduce free online software that predicts the stability of any LNA duplex oligomer. Thermodynamic analysis shows that the single strand–duplex transition is characterized by a favorable enthalpic change and by an unfavorable loss of entropy. A single LNA modification confines the local conformation of nucleotides, causing a smaller, less unfavorable entropic loss when the single strand is restricted to the rigid duplex structure. Additional LNAs adjacent to the initial modification appear to enhance stacking and H-bonding interactions because they increase the enthalpic contributions to duplex stabilization. New nearest-neighbor parameters correctly forecast the positive and negative effects of LNAs on mismatch discrimination. Specificity is enhanced in a majority of sequences and is dependent on mismatch type and adjacent base pairs; the largest discriminatory boost occurs for the central +C·C mismatch within the +T+C+C sequence and the +A·G mismatch within the +T+A+G sequence. LNAs do not affect specificity in some sequences and even impair it for many +G·T and +C·A mismatches. The level of mismatch discrimination decreases the most for the central +G·T mismatch within the +G+G+C sequence and the +C·A mismatch within the +G+C+G sequence. We hypothesize that these discrimination changes are not unique features of LNAs but originate from the shift of the duplex conformation from B-form to A-form.
A locked nucleic acid (LNA) is a useful chemical modification.1−5 Mixed oligonucleotides consisting of LNA, DNA, and RNA residues have improved polymerase chain reaction (PCR) experiments,(6) single-nucleotide polymorphism assays,7,8 RNA interference,1,4 antisense mRNA technology,(2) microRNA profiling and regulation,9,10 aptamers,(11) LNAzymes,(3) microarrays,(12) and nanomaterials.(13) These applications require that LNA oligonucleotides possess specific melting temperatures (Tm) and free energies of association for complementary sequences (ΔG°).(5)
The thermodynamic stability of nucleic acid duplexes has been described with the nearest-neighbor model, which takes into account energetics of nearest-neighbor base pairs and assumes that interactions beyond neighboring nucleotides can be neglected.14−17 The total enthalpy and entropy of duplex annealing are calculated by summation of doublet terms
where Nbp is the number of duplex base pairs. The first term on the right side of eq 1 is the sum over all internal nearest-neighbor doublets (ΔH°i,i+1). The second term (ΔH°init) represents the “initiation” enthalpy, which includes the formation of the duplex first base pair, corrections for the extra hydrogen bond of G·C versus A·T in terminal base pairs,(17) and terminal base–solvent interactions. The initiation parameter varies with the nature of terminal base pairs.15,16 Equation 2 also includes an entropic symmetry correction (ΔS°symmetry) of −1.4 cal mol–1 K–1, which is added when a duplex consists of two identical, self-complementary oligonucleotides.
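Written out from the definitions in the two paragraphs above, the nearest-neighbor sums can be expressed as follows. This is a reconstruction under the assumption that the doublet index runs over the Nbp − 1 internal doublets; it is not a verbatim copy of eqs 1 and 2.

```latex
\begin{align}
\Delta H^\circ_{\text{total}} &= \sum_{i=1}^{N_{bp}-1} \Delta H^\circ_{i,i+1} + \Delta H^\circ_{\text{init}} \tag{1}\\
\Delta S^\circ_{\text{total}} &= \sum_{i=1}^{N_{bp}-1} \Delta S^\circ_{i,i+1} + \Delta S^\circ_{\text{init}} + \Delta S^\circ_{\text{symmetry}} \tag{2}
\end{align}
```

The symmetry term is nonzero (−1.4 cal mol⁻¹ K⁻¹) only for self-complementary duplexes.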
The nearest-neighbor model accurately predicts thermodynamics and melting temperatures (±1.5 °C) of native oligonucleotides.15−19 It appears that the nearest-neighbor model also predicts well single-base mismatches18,20 and some chemical modifications, including single LNAs.21−24 Because LNAs increase duplex stability and change the specificity of base pairing,2,5 LNA nearest-neighbor parameters differ significantly from DNA parameters.
The LNA parameter set is incomplete and does not cover many useful sequences. Thermodynamic parameters have been published for isolated LNA·RNA base pairs introduced into 2′-O-methyl RNA oligonucleotides25,26 and for isolated LNA·DNA base pairs.(21) However, many applications benefit from other types of LNA modifications. For example, a triplet of LNA residues appears to maximize mismatch discrimination and improves single-nucleotide polymorphism assays.(5) Fully LNA-modified probes can selectively capture genomic DNA sequences.(27) To determine the parameters for consecutive LNAs, we measured the stability of duplexes using the fluorescence melting method.(28) The energetics of LNA effects was determined from the difference between LNA-modified and native (core) duplexes. Because we used standard experimental conditions (1 M Na+ and pH 7), new parameters are compatible with existing DNA parameters.
Oligonucleotides were synthesized at Integrated DNA Technologies, purified by HPLC,(29) and dialyzed against storage buffer [10 mM Tris-HCl and 0.1 mM Na2EDTA (pH 7.5)].(28) Concentrated oligonucleotide samples were tested by mass spectroscopy (molecular weights were within 2 g/mol) and capillary electrophoresis (>90% pure). DNA concentrations were determined from predicted extinction coefficients (ε) and sample absorbance at 260 nm using the Beer–Lambert law.29,30 LNA nucleotides were assumed to possess the same extinction coefficients as DNA ones. Coefficients of Texas Red (14400 L mol–1 cm–1) or Iowa Black RQ (44510 L mol–1 cm–1) were added to the ε of labeled oligonucleotides.
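The concentration determination described above amounts to a single application of the Beer–Lambert law. The sketch below is illustrative only: the function name, the 1 cm path length, and the example extinction coefficient of the unlabeled oligonucleotide are assumptions, while the two label coefficients are the values quoted in the text.

```python
# Label extinction coefficients at 260 nm quoted in the text, L mol^-1 cm^-1
TEXAS_RED = 14400.0
IOWA_BLACK_RQ = 44510.0


def strand_concentration(absorbance_260, epsilon_oligo, epsilon_label=0.0, path_cm=1.0):
    """Molar concentration from A260 via the Beer-Lambert law, A = eps * c * l.

    epsilon_oligo is the predicted coefficient of the unlabeled strand; LNA
    bases are treated like DNA bases, as in the text. The label coefficient
    is simply added to it.
    """
    epsilon_total = epsilon_oligo + epsilon_label
    return absorbance_260 / (epsilon_total * path_cm)


# Hypothetical Texas Red-labeled strand with epsilon = 100000 L mol^-1 cm^-1
concentration = strand_concentration(0.5, 100000.0, TEXAS_RED)
```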
Figure 1A shows the sequences studied. Fluorescent Texas Red dye (TXRD) is attached at the 5′ end of the top strand, and Iowa Black RQ quencher (IBRQ) is attached at the 3′ end of the complementary strand. This design efficiently quenches fluorescence when the strands are annealed because the dye and the quencher are in close contact. We use the notation of oligonucleotide manufacturers; LNA nucleotides are indicated with + in front of the base symbol (e.g., +A denotes an adenine LNA nucleotide). Cytosine of +C is 5-methylated because oligonucleotide manufacturers usually synthesize the methylated version of LNA cytosine.
The set of DNA duplexes contains a triplet of consecutive LNAs located either in the interior of the strand labeled with Texas Red or in the interior of the complementary strand labeled with Iowa Black RQ. Eight possible LNA·DNA base pairs (X·Y = +A·T, A·+T, +T·A, T·+A, +C·G, C·+G, +G·C, and G·+C) and 24 mismatches were introduced at the X·Y site. Core duplexes were also measured. They contained DNA·DNA base pairs (X·Y = A·T, T·A, C·G, and G·C) and the same terminal Texas Red–Iowa Black RQ pair. This design is economical; each oligonucleotide is used in several duplexes. Thirty-six duplexes were melted for each set except for set 3. This set consisted of 27 duplexes because its two sequences, GTAGGGGTGCT-IBRQ and GTA+G+G+GGTGCT-IBRQ, were not obtained with sufficient purity.
For sets 1–4, the same base flanks the X·Y site on the 5′ and 3′ sides. For sets 5–8, the flanking bases are different and each of the four bases (A, T, C, and G) occurs once on the 5′ and 3′ sides of the X·Y base pairs. This design ensures that every possible nearest-neighbor interaction is present several times within the data set.
Figure 1A also shows that duplex lengths range from 10 to 12 bp. Such short sequences are likely to melt in a two-state manner. Nevertheless, non-two-state behavior may occur even for short oligonucleotides if they form stable self-complementary structures, e.g., hairpins or dimers. OligoAnalyzer version 3.1 (http://www.idtdna.com/analyzer/Applications/OligoAnalyzer/) confirmed that our sequences do not form such structures.
This paper follows previous conventions to represent duplex sequences.(16) A slash divides the strands in an antiparallel orientation. The sequence is oriented 5′ to 3′ before the slash and 3′ to 5′ after the slash (for example, CA/GT represents the 5′-CA-3′/3′-GT-5′ doublet with two Watson–Crick base pairs). Mismatched nucleotides are underlined or colored red. Ribonucleotides are distinguished from deoxyribonucleotides by the “r” prefix, e.g., rA.
We followed our previously described method for fluorescence melting experiments.(28) The melting buffer contained 1 M NaCl, 3.87 mM NaH2PO4, 6.13 mM Na2HPO4, and 1 mM Na2EDTA and was adjusted to pH 7.0 with 1 M NaOH.(30) Buffer reagents of p.a. grade purity were bought from ThermoFisher Scientific (Pittsburgh, PA).
Melting experiments were performed at 13 different total single-strand concentrations (19, 30, 46, 70, 110, 160, 250, 375, 570, and 870 nM and 1.3, 2.0, and 3.0 μM). Duplex samples were prepared at the highest Ct of 3 μM. Complementary oligonucleotides were mixed in a 1:1 ratio in the melting buffer, heated to 95 °C, and slowly cooled to room temperature. Aliquots of the 3 μM solution were diluted with the melting buffer to make 12 remaining samples. Low-binding Costar microcentrifuge tubes (catalog no. 3207, Corning, Wilkes Barre, PA) were used to reduce the level of binding of oligonucleotides to the tube surface.
We pipetted 25 μL of the melting sample into two wells of a 96-well PCR plate (Extreme Uniform Thin Wall Plates, catalog no. B70501, BIOplastics BV, Landgraaf, The Netherlands). A significant discrepancy between wells alerted us to an erroneous measurement. Using the Bio-Rad iQ5 real-time PCR system, the fluorescence signal in the Texas Red channel was recorded every 0.2 °C while the temperature was increased from 4 to 98 °C and decreased back to 4 °C over two cycles. Subsequent temperature cycles were not used because they were unreliable; Tm sometimes increased, indicating the evaporation of water or degradation of dye. The iQ5 system maintained a temperature rate of 25 °C/h. Analysis was conducted in Microsoft Excel. We programmed VBA software to automate melting profile analysis, including baseline selection using a second-derivative algorithm.(28) The fraction θ was calculated [θ = (F – FL)/(FU – FL)] from the fluorescence of the DNA sample (F), the fluorescence of the upper linear baseline (FU), and the fluorescence of the lower linear baseline (FL). If a duplex melts in a two-state manner, dissociation of the fluorophore from the quencher is likely coupled to the duplex-to-single strand melting transition and θ represents the fraction of melted duplexes.
The melting temperature was defined as the temperature at which θ = 1/2. The average standard deviation of Tm values was 0.4 °C. Transition enthalpies, entropies, and free energies were determined from fits to individual melting profiles and from the dependence of melting temperature on DNA concentration.14,28,31,32 These two analytical methods assume that melting transitions proceed in a two-state manner; that is, intact duplex and unhybridized single strands are dominant, and partially melted duplexes are negligible throughout the melting transition. The methods also assume that transition enthalpies and entropies are temperature-independent. If ΔH° or ΔS° values differed more than 15% between these two methods, the duplex did not melt in a two-state manner.28,32,33 In that case, we excluded ΔH° or ΔS° values from further analysis because they were inaccurate.
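The baseline normalization, the Tm definition, and the 15% two-state criterion can be sketched in a few lines. This is not the authors' VBA code. The function names are hypothetical, the baselines are assumed to be already fitted, Tm is found by simple linear interpolation, and the percent difference is taken relative to the mean of the two values, which is one possible reading of the criterion.

```python
import numpy as np


def melted_fraction(F, F_lower, F_upper):
    """theta = (F - FL) / (FU - FL), evaluated point by point along the melt."""
    F, F_lower, F_upper = (np.asarray(a, dtype=float) for a in (F, F_lower, F_upper))
    return (F - F_lower) / (F_upper - F_lower)


def melting_temperature(temps, theta):
    """Temperature at which theta = 1/2, by linear interpolation.

    Assumes theta increases monotonically with temperature over the input range.
    """
    return float(np.interp(0.5, theta, temps))


def is_two_state(value_from_fit, value_from_concentration, tolerance=0.15):
    """Compare dH (or dS) from the two analysis methods.

    The relative difference is taken against the mean magnitude (an assumption).
    """
    mean = 0.5 * (abs(value_from_fit) + abs(value_from_concentration))
    return abs(value_from_fit - value_from_concentration) / mean <= tolerance
```

A duplex for which `is_two_state` returns `False` for ΔH° or ΔS° would be excluded from the parameter fit, as described above.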
Locked nucleic acids increase duplex stability and alter the melting transition enthalpy, entropy, and free energy. As shown in Figure 1B, we determined these LNA contributions (ΔΔH°, ΔΔS°, and ΔΔG°37) from the difference between LNA-modified and core duplexes.(28) LNA modifications were located at least five nucleotides from the terminal fluorophore and the quencher. In this design, terminal labels do not interact with LNAs and do not influence differential thermodynamic values between modified and core duplexes.
Figure 1B shows an example of the analysis for the Set1–11 duplex. Entering ΔH° from Table S1 of the Supporting Information, we determined the experimentally measured differential enthalpic change [ΔΔH°(A+T+G+TC/TACAG)] to be −97.6 – (−86.4) = −11.2 kcal/mol. In the nearest-neighbor model, this enthalpic contribution is a sum of enthalpies of base pair doublets
Rearrangement of eq 3 places unknown LNA parameters on the left side
The right side of eq 4 contains the experimentally measured enthalpic change and two previously determined nearest-neighbor parameters.(21) McTigue, Peterson, and Kahn investigated the thermodynamics of interactions between LNA·DNA and DNA·DNA base pairs. We used their parameters to account for LNA–DNA interactions that occur in the beginning and at the end of a section of consecutive LNAs. Parameters from their 32NN set (Table 4 of ref (21)) were entered into eq 4
A similar equation was constructed for each LNA duplex. Analogous equations were set up for ΔΔS° and ΔΔG°37 contributions.
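For the Set1–11 example, the decomposition and its rearrangement can be written out as below. This is a reconstruction from the surrounding description, assuming that the two outer doublets (A+T/TA and +TC/AG) are the ones taken from ref (21); the numerical values of those two parameters are not reproduced here.

```latex
\begin{align}
\Delta\Delta H^\circ(\text{A+T+G+TC/TACAG})
  &= -97.6 - (-86.4) = -11.2\ \text{kcal/mol} \notag\\
\Delta\Delta H^\circ(\text{A+T+G+TC/TACAG})
  &= \Delta\Delta H^\circ(\text{A+T/TA}) + \Delta\Delta H^\circ(\text{+T+G/AC}) \notag\\
  &\quad + \Delta\Delta H^\circ(\text{+G+T/CA}) + \Delta\Delta H^\circ(\text{+TC/AG}) \tag{3}\\
\Delta\Delta H^\circ(\text{+T+G/AC}) + \Delta\Delta H^\circ(\text{+G+T/CA})
  &= \Delta\Delta H^\circ(\text{A+T+G+TC/TACAG}) \notag\\
  &\quad - \Delta\Delta H^\circ(\text{A+T/TA}) - \Delta\Delta H^\circ(\text{+TC/AG}) \tag{4}
\end{align}
```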
Selecting two bases from the set of four (A, T, C, and G) with replacement leads to the creation of 16 nearest-neighbor doublets.(34) Because antiparallel strands of native DNA duplexes exhibit structural symmetry, some doublet sequences are identical, e.g., AC/TG and GT/CA. Therefore, 10 nearest-neighbor parameters are sufficient to represent internal DNA·DNA doublets. No such symmetry exists for LNA·DNA base pairs. The +A+C/TG doublet differs from the +G+T/CA doublet. Sixteen nearest-neighbor parameters are needed for consecutive LNA·DNA base pairs.
We measured 62 perfectly matched LNA duplexes. Sixty of them melted in a two-state manner. Their thermodynamic values were used to determine the parameters. Each of the 16 LNA doublets was well represented in this data set with the following numbers of occurrences: 8 +A+A/TT, 8 +A+C/TG, 8 +A+G/TC, 8 +A+T/TA, 8 +C+A/GT, 6 +C+C/GG, 7 +C+G/GC, 8 +C+T/GA, 8 +G+A/CT, 7 +G+C/CG, 4 +G+G/CC, 8 +G+T/CA, 8 +T+A/AT, 8 +T+C/AG, 8 +T+G/AC, and 8 +T+T/AA.
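Counting doublet occurrences is a mechanical step that is easy to automate. The sketch below parses the manufacturer notation used in this paper and lists the nearest-neighbor doublets of a perfectly matched duplex. The function names are hypothetical, and the complementary strand is assumed to be unmodified DNA.

```python
import re

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def tokenize(strand):
    """Split notation such as 'A+T+G+TC' into ['A', '+T', '+G', '+T', 'C']."""
    return re.findall(r"\+?[ACGT]", strand)


def doublets(strand):
    """Nearest-neighbor doublets of a matched duplex, top strand given 5'->3'.

    The bottom strand is written 3'->5' after the slash, e.g. '+T+G/AC'.
    """
    tokens = tokenize(strand)
    result = []
    for left, right in zip(tokens, tokens[1:]):
        bottom = COMPLEMENT[left[-1]] + COMPLEMENT[right[-1]]
        result.append(f"{left}{right}/{bottom}")
    return result


def consecutive_lna_doublets(strand):
    """Keep only doublets in which both top-strand nucleotides are LNA."""
    return [d for d in doublets(strand) if d.count("+") == 2]
```

For the subsequence A+T+G+TC this yields the two outer LNA–DNA doublets handled with the parameters of ref (21) and the two consecutive-LNA doublets whose parameters are unknown.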
First, we examined enthalpic effects. Equation 5 was constructed for each LNA duplex. This thermodynamic analysis produced the set of 60 linear equations
where M is a 60 × 16 matrix of the number of occurrences for each LNA nearest-neighbor doublet in 60 duplexes, Hn–n is the vector of 16 unknown parameters, and Hexp is the column vector of experimentally measured enthalpic contributions. The parameters reported by McTigue et al. were subtracted from the enthalpic contributions as shown in eqs 4 and 5. Because the number of unknown parameters (16) was less than the number of equations (60), eq 6 was overdetermined.(35) We solved it using singular-value decomposition (SVD)(36) by minimizing χ²
where σH is the diagonal matrix whose elements are experimental errors of ΔΔH°. Because these errors were similar, they were set to a constant value of 3 kcal/mol and the SVD fit was not error weighted. Singular-value decomposition was conducted using the Matrix.xla add-in package for Microsoft Excel, version 2.3.2 (Foxes Team, L. Volpi, http://digilander.libero.it/foxes). Calculations were repeated using the Excel LINEST function, yielding the same values. We also examined matrix M for degeneracies. The rank of matrix M was 16. Because the rank was equal to the number of unknown parameters, the matrix had no zero singular values, and the parameters were unique and linearly independent.34−36
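The unweighted fit described above is an ordinary least-squares problem. As a minimal sketch, NumPy's `numpy.linalg.lstsq` (which relies on an SVD-based LAPACK routine) is used here in place of the Excel tools named in the text. The function name and the toy design matrix are hypothetical and do not reproduce the measured data.

```python
import numpy as np


def fit_parameters(M, h_exp):
    """Least-squares solution of M @ h = h_exp with constant (unweighted) errors.

    Returns the fitted parameters, the rank of M, and its singular values.
    A rank equal to the number of columns means the parameters are unique.
    """
    M = np.asarray(M, dtype=float)
    h_exp = np.asarray(h_exp, dtype=float)
    h, _residuals, rank, singular_values = np.linalg.lstsq(M, h_exp, rcond=None)
    return h, rank, singular_values


# Toy 60 x 16 occurrence matrix (hypothetical, full column rank by construction)
M_demo = np.vstack([np.eye(16)] * 3 + [np.ones((12, 16))])
h_demo = -np.arange(1.0, 17.0)  # hypothetical "true" parameters, kcal/mol
h_fit, rank_demo, s_demo = fit_parameters(M_demo, M_demo @ h_demo)
```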
Error estimates of parameters were obtained from bootstrap simulations.(37) These calculations estimate the dependence of parameter values on the data set. Many bootstrap data sets were created from the original data set. A different value of the parameter was usually determined from each bootstrap data set. The bootstrap estimate of the parameter error is given by the standard deviation of all these parameter values.
In our simulations, the bootstrap data sets were the same size as the original data set; i.e., the sets contained data from 60 duplexes. The duplex data were randomly drawn, with replacement, from the original data set. This means that the entire experimental data set was available in each drawing. This procedure produced bootstrap data sets in which some duplex data from the original data sets were present multiple times and other data were not selected. We generated 5 × 10⁴ bootstrap data sets. Equation 6 was solved for each data set using SVD, and 16 parameters for the consecutive LNAs were determined. If the rank of M was less than 16, the particular bootstrap data set did not contain all possible nearest-neighbor doublet sequences. Thermodynamic parameters could not be determined in this case; therefore, the bootstrap set was excluded from analysis, and a replacement data set was drawn. Fewer than 3% of the data sets were excluded. Standard deviations and averages were calculated from bootstrap parameter estimates. The average parameters determined from bootstrap analysis agreed with the parameters determined from the original data set.
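The resampling procedure can be sketched as follows. The function name, the default number of data sets, and the guard against endless rejection are assumptions made for this illustration; the rank test and the redraw of rejected sets follow the description above.

```python
import numpy as np


def bootstrap_errors(M, h_exp, n_sets=1000, seed=0):
    """Bootstrap mean and standard deviation of nearest-neighbor parameters.

    Rows (duplexes) are drawn with replacement; rank-deficient resamples are
    discarded and redrawn. Returns (means, standard deviations, rejected count).
    """
    M = np.asarray(M, dtype=float)
    h_exp = np.asarray(h_exp, dtype=float)
    n_duplexes, n_params = M.shape
    rng = np.random.default_rng(seed)
    estimates, n_rejected = [], 0
    while len(estimates) < n_sets:
        if n_rejected > 100 * n_sets:  # safety guard, not part of the original method
            raise RuntimeError("too many rank-deficient resamples")
        rows = rng.integers(0, n_duplexes, size=n_duplexes)
        M_b, h_b = M[rows], h_exp[rows]
        if np.linalg.matrix_rank(M_b) < n_params:
            n_rejected += 1
            continue
        h, *_ = np.linalg.lstsq(M_b, h_b, rcond=None)
        estimates.append(h)
    estimates = np.array(estimates)
    return estimates.mean(axis=0), estimates.std(axis=0, ddof=1), n_rejected
```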
We have analyzed the error in free energy calculated from entropic and enthalpic contributions (ΔΔG° = ΔΔH° – TΔΔS°). Enthalpies and entropies of DNA melting transitions are correlated.(38) The errors of the enthalpic contribution, σ(ΔΔH°), and the entropic contribution, σ(ΔΔS°), are also highly correlated; their correlation coefficient is usually above 0.99.17,21 If the uncertainty in ΔΔG° is estimated by error propagation,(39) the covariance cov(ΔΔH°,ΔΔS°) significantly decreases the error
Equation 8 indicates that the free energy is determined more precisely than the enthalpic or entropic contributions alone. A similar error compensation decreases the error in the melting temperature calculated from ΔH° and ΔS°.(17) This analysis demonstrates that it is useful to report the ΔΔH° and ΔΔS° parameters in Tables 1–3 beyond their individual errors. If the parameters are rounded to their error estimates, the calculated free energies and melting temperatures may be less precise.
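The effect of the covariance term can be illustrated with standard first-order error propagation for ΔΔG° = ΔΔH° – TΔΔS°, that is, σ²(ΔΔG°) = σ²(ΔΔH°) + T²σ²(ΔΔS°) − 2T cov(ΔΔH°,ΔΔS°). The sketch below assumes that the covariance is expressed through a correlation coefficient ρ; the function name and the example numbers are hypothetical.

```python
import math


def sigma_free_energy(sigma_h, sigma_s, rho, temperature=310.15):
    """Propagated error of ddG = ddH - T * ddS, in kcal/mol.

    sigma_h: error of ddH in kcal/mol
    sigma_s: error of ddS in cal/(mol K), converted to kcal internally
    rho: correlation coefficient between the ddH and ddS errors
    """
    sigma_s_kcal = sigma_s / 1000.0
    covariance = rho * sigma_h * sigma_s_kcal
    variance = (sigma_h ** 2
                + (temperature * sigma_s_kcal) ** 2
                - 2.0 * temperature * covariance)
    return math.sqrt(max(variance, 0.0))  # clamp tiny negative rounding errors


# Hypothetical errors: strongly correlated versus uncorrelated
correlated = sigma_free_energy(3.0, 9.0, rho=0.99)
uncorrelated = sigma_free_energy(3.0, 9.0, rho=0.0)
```

With ρ close to 1, the propagated error is much smaller than either individual term, which is the compensation discussed above.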
Validation sets were measured by ultraviolet spectroscopy as previously described.(30) Absorbance at 268 nm was recorded every 0.1 °C using a Beckman DU 650 spectrophotometer. The temperature was changed at a rate of 25 °C/h in the range from 10 to 98 °C using a high-performance temperature controller (Beckman-Coulter, Brea, CA). Both heating and cooling melting profiles were collected. Sloping baselines were subtracted from the melting profiles,(30) and the melting temperature was defined as the temperature at which the fraction of melted duplexes equaled 0.5.
There are 12 possible LNA·DNA mismatches (+A·A, +C·C, +G·G, +T·T, +A·C, +C·A, +A·G, +G·A, +C·T, +T·C, +G·T, and +T·G). In our design, mismatches were located in the center of LNA triplets. Enthalpic, entropic, and free energy effects were determined from the differences between the energetics of LNA mismatch duplexes and core DNA duplexes.
As an example, let us consider the Set1–17 duplex containing the +G·G mismatch. The enthalpic contribution from the A+T+G+TC/TAGAG duplex subsequence is calculated from the difference in the total enthalpy of the Set1–17 (TXRD-CGTCA+T+G+TCGC) and Set1–10 (TXRD-CGTCATGTCGC) duplexes (Table S1 of the Supporting Information)
The nearest-neighbor model assumes that this contribution is the sum of four nearest-neighbor doublets. We used the parameters of McTigue et al.(21) for two doublets (A+T/TA and +TC/AG). An equation similar to eq 4 was constructed. The left side contains two unknown parameters
Equation 10 was built for each mismatched duplex. Sequences having the +G·G mismatch were grouped into one subset. The resulting system of linear equations was overdetermined and was solved by SVD analysis. Eight unknown nearest-neighbor parameters (+A+X/TY, +C+X/GY, +G+X/CY, +T+X/AY, +X+A/YT, +X+C/YG, +X+G/YC, and +X+T/YA) were obtained (+X = +G and Y = G). This procedure was repeated for 12 +X·Y mismatch types, and 96 parameters (8 × 12) were determined. The thermodynamic values of LNA duplexes containing mismatches can be predicted from eqs 1 and 2 using new parameters.
SVD analysis indicated that the number of linearly independent equations and the rank of matrix M were both seven for mismatches. Eight fitted nearest-neighbor parameters are useful, but they are not a unique solution34,35,40 because a constraint equation relates the numbers of eight doublets (N)
The constraint decreases the number of unique parameters to seven for each mismatch type. Equation 11 is valid for duplexes containing mismatches within consecutive LNAs.
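The rank deficiency follows directly from the design: every mismatch has exactly one doublet on its 5′ side and one on its 3′ side, so the four 5′ counts and the four 3′ counts always have the same sum. The sketch below builds the occurrence matrix for all 16 flanking contexts of one mismatch type and checks its rank. The function name and the column ordering (which follows the parameter list given above) are choices made for this illustration.

```python
import itertools

import numpy as np

BASES = "ACGT"


def mismatch_design_matrix():
    """Occurrence matrix for the 16 contexts +K+X+M of a single mismatch type.

    Columns 0-3: 5' doublets +A+X, +C+X, +G+X, +T+X.
    Columns 4-7: 3' doublets +X+A, +X+C, +X+G, +X+T.
    """
    rows = []
    for k, m in itertools.product(BASES, repeat=2):
        row = np.zeros(8, dtype=int)
        row[BASES.index(k)] = 1
        row[4 + BASES.index(m)] = 1
        rows.append(row)
    return np.array(rows)


M_mismatch = mismatch_design_matrix()
rank_mismatch = np.linalg.matrix_rank(M_mismatch)
```

The rank is seven rather than eight, so only seven linearly independent parameters can be determined per mismatch type.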
Unique, linearly independent parameters can be constructed from linear combinations of eight nonunique parameters. A similar constraint limits the number of unique parameters for some DNA mismatches. Allawi and SantaLucia proposed seven linearly independent sequences for DNA mismatches.(41) They added a C·G base pair to nonunique doublets to create linearly independent triplets. Using a similar procedure, we have added the +C·G base pair to LNA doublets and created seven unique triplets (+A+X+C/TYG, +C+X+C/GYG, +G+X+A/CYT, +G+X+C/CYG, +G+X+G/CYC, +G+X+T/CYA, and +T+X+C/AYG). A single LNA mismatch lies in the center. Using SVD analysis, seven parameters for those triplets were determined for each +X·Y mismatch type (Table S3 of the Supporting Information).
The energetics of any LNA +X·Y mismatch sequence (+K+X+M/LYN) could also be calculated from unique triplet parameters
The +K·L and +M·N pairs are Watson–Crick base pairs adjacent to the mismatch. The numbers of seven triplets, N(triplet), are related to the numbers of doublets: N(+A+X+C/TYG) = N(+A+X/TY), N(+C+X+C/GYG) = N(+C+X/GY), N(+G+X+A/CYT) = N(+X+A/YT), N(+G+X+C/CYG) = N(+G+X/CY) – N(+X+A/YT) – N(+X+G/YC) – N(+X+T/YA), N(+G+X+G/CYC) = N(+X+G/YC), N(+G+X+T/CYA) = N(+X+T/YA), and N(+T+X+C/AYG) = N(+T+X/AY). The doublet and triplet parameter sets predict identical thermodynamic values for any LNA mismatch sequence. Both parameter sets are implementations of the same nearest-neighbor model and do not take into account any next-nearest-neighbor or longer interactions.
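The equivalence of the two parameter sets can be checked numerically. In the sketch below, each triplet value is assumed to equal the sum of its two doublet values, and the triplet counts follow the relations listed above. The doublet values are arbitrary placeholders, not fitted parameters, and all names are hypothetical.

```python
import itertools

BASES = "ACGT"
# (5' neighbor, 3' neighbor) of the seven unique triplets +K+X+M
UNIQUE_TRIPLETS = ["AC", "CC", "GA", "GC", "GG", "GT", "TC"]


def triplet_parameters(five, three):
    """Triplet values built as sums of a 5' doublet and a 3' doublet value."""
    return {k + "X" + m: five[k] + three[m] for k, m in UNIQUE_TRIPLETS}


def triplet_counts(k, m):
    """N(triplet) for context +K+X+M, from the doublet-count relations in the text."""
    n5 = {b: int(b == k) for b in BASES}
    n3 = {b: int(b == m) for b in BASES}
    return {
        "AXC": n5["A"],
        "CXC": n5["C"],
        "GXA": n3["A"],
        "GXC": n5["G"] - n3["A"] - n3["G"] - n3["T"],
        "GXG": n3["G"],
        "GXT": n3["T"],
        "TXC": n5["T"],
    }


def predict_from_triplets(k, m, triplets):
    """Weighted sum of triplet values for context +K+X+M."""
    return sum(n * triplets[name] for name, n in triplet_counts(k, m).items())


# Placeholder doublet values in kcal/mol (hypothetical)
five_demo = {"A": 0.5, "C": 1.0, "G": -0.3, "T": 0.8}
three_demo = {"A": 0.2, "C": -0.6, "G": 0.9, "T": 0.4}
triplets_demo = triplet_parameters(five_demo, three_demo)
```

For every one of the 16 contexts, `predict_from_triplets` returns the same value as the direct doublet sum `five_demo[k] + three_demo[m]`; note that some counts of the +G+X+C/CYG triplet are negative.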
Thermodynamic values were measured for the primary oligonucleotide set using fluorescence.(28) The melting process was monitored using Texas Red dye and Iowa Black RQ quencher, which were attached at the termini of duplexes. These labels appear to be optimal for melting experiments, as other fluorophores (FAM, HEX, and TET) do not provide reliable thermodynamic values and may ruin the two-state nature of melting transitions.(28)
Fluorescence versus temperature plots always exhibited single, S-shaped transitions that were reversible. Figure 2 presents examples of averaged melting profiles. Pictured duplexes have a TXRD-CGTCA+T+A+TCGC base sequence. The DNA matched duplex (dashed line) is more stable than the LNA duplex containing the +A·A mismatch (dotted line) in Figure 2. This stability order is sequence-dependent and not universally observed. If LNAs cause large duplex stabilization and a single mismatch destabilizes a duplex less, the mismatched LNA duplex will be more stable than the matched DNA duplex of the same base sequence. This occurs often for +G·T, +T·G, +G·G, and +G·A mismatches.
Thermodynamic values were extracted from melting profiles. First, the enthalpy, entropy, and free energy were estimated from fits to individual melting profiles.(28) We fitted only data within the transition where fraction θ ranged from 0.15 to 0.85. Second, ΔH°, ΔS°, and ΔG°37 were determined from graphs of 1/Tm versus ln Ct/4. These graphs were linear over a 150-fold range of 13 DNA concentrations (Figure S1 of the Supporting Information). When the 1/Tm data point deviated from the fitted straight line by a value more than twice the value of the propagated error, it was removed from the fit as an outlier. Fewer than 1% of all graph points were excluded. Melting temperatures and thermodynamic values for the primary data set are presented in Table S1 of the Supporting Information. The enthalpy, entropy, and free energy are negative because they are reported for the annealing reaction, which is customary practice.
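The concentration-dependence analysis can be sketched as a straight-line fit. For a two-state transition of non-self-complementary strands, the standard relation is 1/Tm = (R/ΔH°) ln(Ct/4) + ΔS°/ΔH°. The function name, the value of the gas constant, and the synthetic example data are assumptions of this sketch, and no outlier rejection is included.

```python
import numpy as np

R = 1.987  # gas constant, cal mol^-1 K^-1


def fit_concentration_dependence(ct_molar, tm_celsius):
    """Fit 1/Tm versus ln(Ct/4) and return annealing dH, dS, dG37.

    dH and dG37 are returned in kcal/mol, dS in cal/(mol K).
    """
    x = np.log(np.asarray(ct_molar, dtype=float) / 4.0)
    y = 1.0 / (np.asarray(tm_celsius, dtype=float) + 273.15)
    slope, intercept = np.polyfit(x, y, 1)
    dH_cal = R / slope             # slope = R / dH
    dS = intercept * dH_cal        # intercept = dS / dH
    dG37 = (dH_cal - 310.15 * dS) / 1000.0
    return dH_cal / 1000.0, dS, dG37


# The 13 total strand concentrations used in this study, in mol/L
CT = np.array([19, 30, 46, 70, 110, 160, 250, 375, 570, 870, 1300, 2000, 3000]) * 1e-9
```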
Our thermodynamic analysis assumed a two-state nature of melting transitions. When this assumption is valid, both 1/Tm versus ln Ct/4 plots and fits to melting profiles yield the same results. If thermodynamic values differed more than 15% between these two methods, the specific duplex did not melt in a two-state fashion, and its thermodynamic data were removed from further analysis, averages, and fitting of nearest-neighbor parameters. For the primary data set, average differences between both methods in ΔH°, ΔS°, and ΔG°37 values were 7.2, 8.3, and 2.5%, respectively.
Duplexes exhibiting deviations from the two-state melting behavior are listed in Table S1 of the Supporting Information. The non-two-state melting transitions may occur when the cooperativity of the melting process is low and the duplex melts in several stages. The oligonucleotides can also fold into alternative stable structures, broadening the melting transition or splitting it into two S-shaped transitions. We did not observe the second transition in any melting profile. Duplexes also have a terminal dye–quencher pair that can interact with neighboring base pairs; this could change duplex melting behavior and local base pair cooperativity. Because fluorescence depends on dye–quencher distance and orientation, the fluorescent signal is more sensitive to non-two-state behavior than the UV absorbance signal. If dissociation of the dye from the quencher does not coincide with duplex melting, discrepancies in thermodynamic analysis are likely to occur and thermodynamic values could be inaccurate.
The majority of duplexes in the data set (>93%) exhibited two-state melting transitions, and the average ΔH°, ΔS°, and ΔG°37 values of those duplexes were used to determine nearest-neighbor parameters. Table 1 shows the nearest-neighbor parameters for consecutive LNAs. Standard errors were estimated from bootstrap analysis. The free energy values calculated from the Gibbs thermodynamic relation (ΔΔG°37 = ΔΔH° – 310.15ΔΔS°) agreed within 0.09 kcal/mol with the ΔΔG°37 values determined from SVD analysis. This agreement confirms the consistency of our method.
Because ΔΔG°37 is negative for all nearest-neighbor doublets in Table 1, consecutive LNAs always stabilize a DNA duplex, although the size of the effect is sequence-dependent. The most stabilizing doublets are +C+C/GG (ΔΔG°37 = −2.3 kcal/mol) and +G+G/CC (ΔΔG°37 = −2.0 kcal/mol). The smallest LNA impact is seen for the +A+A/TT (−0.6 kcal/mol) and +T+T/AA (−0.8 kcal/mol) sequences. Effects of LNAs on ΔΔG°37 are approximately proportional to the duplex fraction of G·C base pairs. Introduction of LNAs stabilizes cytosine-guanine base pairs ~0.9 kcal/mol more than adenine-thymine base pairs. The ΔΔS° values vary widely from −23.5 to 0.7 cal mol⁻¹ K⁻¹.
Thermodynamic parameters in Table 1 are differential thermodynamic parameters; i.e., they represent deviations from native DNA duplexes. To calculate the total enthalpy for any LNA-modified sequence, one predicts the transition enthalpy for the native DNA duplex (ΔH°) according to eq 1 and adds the differential parameters (ΔΔH°) to take into account LNA effects
Both sums of eq 13 contain the same doublet sequences; the difference is in LNA modification (CA/GT vs +C+A/GT). Parameters for the same base sequences could be combined. Addition of differential LNA parameters (ΔΔH°) and DNA nearest-neighbor parameters(16) gives full nearest-neighbor LNA parameters (ΔH°)
where +K+X/LY is a nearest-neighbor doublet. We present full thermodynamic parameters for consecutive and isolated LNA modifications in Table 2. It is faster and takes fewer computer resources to calculate thermodynamic values from full thermodynamic parameters than from differential ones.
As an example, we present calculations for the perfectly matched 5′-TA+C+AGG-3′ duplex.
The first and last parameters represent initiation interactions using the concept of a fictitious end base (E).15,16,34 Transition entropies and free energies can also be cast into full parameters using analogous relationships.
To verify the analysis and applicability of the nearest-neighbor model, we used new parameters to predict thermodynamics of the primary data set. New LNA parameters accurately predicted ΔH°, ΔS°, and ΔG° values for these short duplexes. The average relative errors were 3.3, 3.5, and 2.9%, respectively. This is comparable to the accuracy reported for nearest-neighbor parameters of native nucleic acids where standard deviations of thermodynamic values ranged from 3 to 8%.(17)
To estimate the robustness of the new parameters, it is important to test their performance with an independent validation set of duplex oligomers that were not used to derive the parameters. We have measured 53 additional LNA-modified duplexes. The oligonucleotides did not have any fluorescent labels or quenchers attached. Their melting transitions were followed using UV spectroscopy.(30) These LNA duplexes ranged from 8 to 10 bp in length, from 10 to 88% in G·C content, and from 20 to 60% in LNA content. Figure 3 presents a comparison of experimentally measured melting temperatures with predictions. Good agreement is observed. Additional details are listed in Table S2 of the Supporting Information. The new parameters in Table 2 result in an average Tm prediction error of 2.1 °C (χ² = 2549).
Exiqon also developed a thermodynamic model of locked nucleic acids.(42) Because their parameters have not been publicly disclosed and the algorithm has not been described in detail, we relied on Tm predictions that were obtained online using their software. Comparison with experimental melting temperatures reveals that the Exiqon model tends to overestimate melting temperatures for our validation set. The average Tm prediction error is 4.2 °C, and χ² is equal to 7981. This level of accuracy agrees with the values reported by the developers where a standard deviation of 5.0 °C was obtained for Tm predictions of chimeric LNA·DNA duplexes.(42) Assuming a normal distribution of measured melting temperatures, probability P of the null hypothesis that this χ² difference occurs by random chance is less than 0.01. Thus, a two-tailed F-test for the ratio of χ² values30,35 indicates that the new parameters from Table 2 predict melting temperatures more accurately than the Exiqon software.
From the primary data set, we determined nearest-neighbor parameters for single mismatches using SVD analysis. Table 3 shows eight doublet parameters for each of 12 LNA mismatch types. Thermodynamic parameters are influenced by flanking base pairs and the type of mismatch. The doublet format of nearest neighbors simplifies software implementation, but eight parameters for mismatch doublets are not unique, which was demonstrated in Materials and Methods. The constraint equation (eq 11) limits the number of linearly independent parameters to seven for each mismatch type. The unique parameters were constructed in triplet format and are listed in Table S3 of the Supporting Information.
To investigate trends and relationships of mismatch stabilities, we predicted thermodynamic values for all possible LNA triplets with a central mismatch. Matched base pairs flank the mismatch on both 5′ and 3′ sides. There are four possibilities for each flanking base pair (+A·T, +T·A, +C·G, and +G·C). Sixteen triplets, therefore, exist for each mismatch type (+A+X+A/TYT, +A+X+C/TYG, +A+X+G/TYC, +A+X+T/TYA, +C+X+A/GYT, +C+X+C/GYG, +C+X+G/GYC, +C+X+T/GYA, +G+X+A/CYT, +G+X+C/CYG, +G+X+G/CYC, +G+X+T/CYA, +T+X+A/AYT, +T+X+C/AYG, +T+X+G/AYC, and +T+X+T/AYA). There are 4 × 3 = 12 mismatch types because three types exist for each LNA nucleotide (for example, +A·A, +A·C, and +A·G for LNA adenine). The total number of unique triplets is therefore 16 × 12 = 192. Contributions to the free energy of the duplex transition (ΔG°37) were predicted for these triplets containing LNA mismatches, DNA mismatches, and related perfectly matched sequences using parameters from Tables 2 and 3 and ref (20). Table 1 of ref (20) seems to have typographical errors, so we used parameters from Table 2 of ref (16) instead. To model mismatch effects in the interior of a duplex, initiation free energies were not taken into account.
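The count of 192 triplets can be verified by direct enumeration. The sketch below generates every LNA triplet with a central mismatch in the notation of this paper; the function name is hypothetical.

```python
import itertools

BASES = "ACGT"
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def mismatch_triplets():
    """All +K+X+M/LYN triplets with Watson-Crick flanks and a central +X.Y mismatch."""
    triplets = []
    for x in BASES:            # LNA base of the mismatch
        for y in BASES:        # opposing DNA base
            if y == COMPLEMENT[x]:
                continue       # Watson-Crick pair, not a mismatch
            for k, m in itertools.product(BASES, repeat=2):
                top = f"+{k}+{x}+{m}"
                bottom = f"{COMPLEMENT[k]}{y}{COMPLEMENT[m]}"
                triplets.append(f"{top}/{bottom}")
    return triplets


ALL_TRIPLETS = mismatch_triplets()
```

The list contains 12 mismatch types with 16 flanking contexts each.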
The LNA triplets were sorted according to free energy contributions. The least stable LNA mismatch is +A+C+T/TCA (ΔG°37 = 2.7 kcal/mol). The same C·C mismatch context is also the most destabilizing for DNA·DNA single-base mismatches.(43)
The most stable LNA mismatch is the +G·T mismatch within the context of +G+G+C/CTG (−5.5 kcal/mol). It is interesting that the most stabilizing DNA mismatch occurs in the same sequence context, but it is the G·G mismatch instead, GGC/CGG (−2.2 kcal/mol).
Average ΔG°37 values over 16 triplet contexts produced a trend of decreasing stability for mismatches within consecutive LNA·DNA base pairs: +G·T > +G·G > +T·G ≈ +G·A > +C·A > +T·T > +A·G ≈ +C·T > +A·A > +A·C ≈ +T·C > +C·C. The trend of relative stabilities of RNA·RNA mismatches closely resembles this trend:(44) rG·rU > rG·rG > rU·rU > rA·rC > rC·rU > rA·rA ≈ rA·rG ≈ rC·rC. The stability trend of DNA·DNA mismatches shows some similarities:(43) G·G > G·T ≈ G·A > T·T ≈ A·A > T·C > A·C > C·C. The main differences between LNAs and DNAs are the higher relative stabilities of +G·T and +C·A mismatches and the lower relative stability of the +A·G mismatch. The order of stability of hybrid RNA·DNA mismatches is between the trends of RNA·RNA and DNA·DNA mismatches.45,46 The most stable mismatch is the rG·T mismatch, as in RNAs, while the rC·A mismatch has relatively low stability, as in DNAs.
To study the dependence of mismatch discrimination on oligonucleotide sequence, the free energy of mismatch discrimination (ΔΔG°) was defined as the difference between the free energies of the mismatched and matched duplexes. The ΔΔG° value quantifies the amount of destabilization due to a mismatch. Let us define G·G mismatch discrimination in the +T+G+T/AGA LNA triplet
ΔΔG°(LNA) = ΔG°37(+T+G+T/AGA) − ΔG°37(+T+G+T/ACA)   (16)
and in the isosequential DNA triplet
ΔΔG°(DNA) = ΔG°37(TGT/AGA) − ΔG°37(TGT/ACA)   (17)
Values of ΔΔG° are positive because the lower stability of the mismatch makes ΔG°37 less negative. The larger the ΔΔG° values, the stronger the destabilization and mismatch discrimination. A positive difference between eqs 16 and 17 [ΔΔG°(LNA) – ΔΔG°(DNA)] indicates that LNAs increased the level of mismatch discrimination. A negative difference means that LNA modifications decreased the level of mismatch discrimination. We have predicted these free energy differences for the entire set of 192 possible mismatch triplets. Figure 4 shows the range of ΔΔG°(LNA) – ΔΔG°(DNA) values for each mismatch type. LNA modification enhances discrimination for 85% of sequences and weakens it for 8%. Free energy differences are insignificant, that is, between −0.2 and 0.2 kcal/mol, for 7% of mismatches.
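The bookkeeping behind this classification can be sketched in Python as follows. The ±0.2 kcal/mol band is taken from the text; the function names and the free energies in the example are hypothetical and are not values from Tables 2 and 3.

```python
def discrimination(dg_mismatched, dg_matched):
    """ΔΔG° = ΔG°37(mismatched) − ΔG°37(matched), in kcal/mol."""
    return dg_mismatched - dg_matched


def lna_effect(ddg_lna, ddg_dna, threshold=0.2):
    """Classify ΔΔG°(LNA) − ΔΔG°(DNA) using the ±threshold band."""
    difference = ddg_lna - ddg_dna
    if difference > threshold:
        return "enhanced"
    if difference < -threshold:
        return "weakened"
    return "insignificant"


# Hypothetical triplet free energies (kcal/mol), for illustration only.
ddg_lna = discrimination(dg_mismatched=-1.0, dg_matched=-5.0)  # 4.0
ddg_dna = discrimination(dg_mismatched=-0.5, dg_matched=-3.0)  # 2.5
effect = lna_effect(ddg_lna, ddg_dna)  # "enhanced"
```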
Figure 4 shows that LNAs negatively impact discrimination of +G·T mismatches and some +C·A mismatches. It appears that base pairs flanking a mismatch affect discrimination as well. The +G·C base pairs adjacent to a mismatch decrease the level of discrimination, while +A·T or +T·A base pairs increase it. To quantify this effect, we averaged ΔΔG°(LNA) – ΔΔG°(DNA) differences over possible triplet sequences containing a specific flanking base pair. The order of increasing mismatch discrimination resulting from the flanking base pair is as follows: +G·C < +C·G < +A·T ≈ +T·A [with average ΔΔG°(LNA) – ΔΔG°(DNA) differences of 0.9, 1.2, 1.6, and 1.7 kcal/mol, respectively]. The effect is not dependent on the flanking base pair location because the base pairs on the 5′ side of the mismatch exhibit the same trend as the base pairs on the 3′ side. In agreement with these observations, +G+G+C/CTG and +G+C+G/CAC mismatches exhibit the largest decreases in the level of discrimination from DNA to LNA with ΔΔG°(LNA) – ΔΔG°(DNA) differences of −1.7 and −1.5 kcal/mol, respectively.
The largest increases in the level of mismatch discrimination, i.e., the most positive ΔΔG°(LNA) – ΔΔG°(DNA) differences, are seen for the +C·C mismatch in the +T+C+C/ACG triplet (3.4 kcal/mol), the +A·G mismatch in +T+A+G/AGC and +T+A+T/AGA (3.4 kcal/mol), and the +T·C mismatch in +A+T+C/TCG (3.2 kcal/mol). LNAs significantly enhance discrimination of all +G·G, +G·A, +A·A, and +T·T mismatches, as well.
The free energies in Figure 4 were calculated at 37 °C, the temperature of the human body. In some biological applications, for instance, polymerase chain reaction, oligonucleotides are annealed at higher temperatures. Analysis at 60 °C reveals a similar dependence of LNA discriminatory effects on mismatch type (data not shown). However, values of ΔΔG°(LNA) – ΔΔG°(DNA) increased by ~0.5 kcal/mol for the +G·T, +C·A, and +A·C mismatches. This result suggests that the positive effect of LNA on mismatch discrimination increases with temperature. For example, LNAs improve mismatch discrimination, in relative terms with respect to DNA, for half of +G·T mismatches at 60 °C, while such positive effects are rare at 37 °C.
In our analysis, we assumed negligible heat capacity effects (ΔCp ~ 0). This has also been assumed for previously published thermodynamic parameters, although recent comprehensive studies(47) detected small heat capacity changes, ~50 cal mol–1 K–1 bp–1. Because similar mismatch discrimination trends are predicted at different temperatures, the veracity of this assumption does not seem to seriously influence the results of mismatch analysis.
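The role of the ΔCp assumption can be made explicit with a minimal Python sketch based on standard thermodynamic relations. With ΔCp = 0, ΔH° and ΔS° are temperature-independent and ΔG°(T) = ΔH° − TΔS°. With a nonzero ΔCp, both are corrected from a reference temperature. The function name, the choice of 37 °C as the reference temperature, and the numerical values are assumptions made for illustration; the sign of ΔCp must follow the same convention (association or melting) as ΔH° and ΔS°.

```python
import math


def delta_g(dh_kcal, ds_cal, temp_c, dcp_cal=0.0, ref_c=37.0):
    """Free energy (kcal/mol) at temp_c.

    dh_kcal: ΔH° in kcal/mol at the reference temperature ref_c
    ds_cal:  ΔS° in cal/(mol·K) at the reference temperature ref_c
    dcp_cal: ΔCp in cal/(mol·K); 0 reproduces ΔG° = ΔH° − TΔS°
    """
    t = temp_c + 273.15
    t_ref = ref_c + 273.15
    dh = dh_kcal * 1000.0 + dcp_cal * (t - t_ref)    # ΔH°(T) in cal/mol
    ds = ds_cal + dcp_cal * math.log(t / t_ref)      # ΔS°(T) in cal/(mol·K)
    return (dh - t * ds) / 1000.0


# Hypothetical duplex: ΔH° = −60 kcal/mol, ΔS° = −160 cal/(mol·K).
dg_37 = delta_g(-60.0, -160.0, 37.0)
dg_60 = delta_g(-60.0, -160.0, 60.0)
```

In this example the duplex is less stable at 60 °C than at 37 °C, as expected for an enthalpy-driven association.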
To test the accuracy of mismatch parameters, we measured the stability of LNA mismatches described in the previous paragraphs. Table 4 lists sequences and their melting temperatures. Neither dye nor quencher was attached to these oligonucleotides. Their melting temperatures were determined using ultraviolet melting experiments. LNA modifications were predicted (1) to decrease the level of mismatch discrimination of VAL-A and VAL-B sequences, (2) not to affect discrimination of VAL-C and VAL-D sequences, and (3) to enhance mismatch discrimination of VAL-E, VAL-F, and VAL-G sequences. Considering limitations of the nearest-neighbor model,19,48 the predicted discrimination effects (ΔTm) agree with experimental measurements for all seven sequence sets.
New LNA mismatch parameters result in an average Tm prediction error of 2.9 °C for the sequences in Table 4. The accuracy of DNA mismatch parameters (20) is the same. For DNA or LNA matched duplexes, average errors of predicted melting temperatures are less than 1.3 °C. The lower accuracy of mismatch predictions suggests that mismatched duplexes are more likely to deviate from assumptions of the nearest-neighbor model and two-state transitions. A small perturbation, like a single-nucleotide mismatch, does not usually break down assumptions of the nearest-neighbor model, but it may increase the magnitude of interactions propagating beyond nearest-neighbor nucleotides. These long-range interactions are often of electrostatic origin and likely become more significant in buffers with low counterion concentrations (<40 mM Na+). We expect weaker H-bonding interactions and increased nucleobase flexibility at the mismatch site. This potentially decreases cooperativity and increases deviations from two-state melting behavior.
The LNA modifications placed at every second or third nucleotide position are very effective in increasing the duplex stability and affinity for complementary targets.(42) Mismatch discrimination is improved most if the triplet of consecutive LNAs is centered on the mismatch site.(5) A single LNA modification usually discriminates less. We were therefore motivated to study thermodynamics of consecutive LNAs to expand the published nearest-neighbor model of single LNA modifications and improve our understanding of LNA·DNA duplex stability.
We employed the fluorescence melting method to measure the stability of modified oligonucleotide duplexes.(28) This new technology allows measurements for large sets of duplexes with unprecedented speed, and its accuracy is similar to the accuracy of the ultraviolet optical melting method. Using the fluorescence method, the experimental errors of ΔH°, ΔS°, ΔG°, and Tm were 8%, 9%, 4%, and 0.4 °C, respectively. If the duplex melts in a two-state manner, the thermodynamic values from the two methods agree. The transition enthalpies and entropies measured using the fluorescence method differed by <4% from the values determined by the UV melting method.(28) The free energy values agreed within 2.5% when the optimal Texas Red–Iowa Black RQ pair was attached to the duplex terminus. These differences are similar to or smaller than the errors seen in UV melting experiments where the errors of ΔH°, ΔS°, and ΔG° are ~8, ~8, and ~4%, respectively.17,41
The fluorescence melting method relies on the dye–quencher pair attached to one of the duplex termini as shown in Figure 1B. When the duplex melts, the dye and the quencher dissociate, which increases the magnitude of the fluorescence signal. Although the terminal dye–quencher pair stabilizes the duplex, it is attached to both the LNA-modified duplex and the core duplex. The Texas Red–Iowa Black RQ labels therefore change the ΔH°, ΔS°, and ΔG° values of both duplexes by the same amount. The thermodynamic impact of LNA modification is determined from the difference between the LNA-modified and core duplexes. We have shown previously that these thermodynamic differences (ΔΔH°, ΔΔS°, and ΔΔG°) are not affected by terminal labels.(28) The stabilizing effect of labels cancels out in this analysis.
Using SVD, we determined nearest-neighbor parameters for consecutive LNA·DNA base pairs. New parameters accurately predict melting temperatures of chimeric LNA·DNA duplexes. The average error was ~2 °C, which is the best accuracy that can be achieved by the nearest-neighbor model.19,48 If LNA modifications amount to a moderate perturbation of a DNA duplex, new parameters are most accurate. Analysis of the validation data set (Table S2 of the Supporting Information) suggests that accuracy decreases slightly as the percentage of LNA modifications increases. The duplexes of VAL-01–VAL-33 are predicted more accurately (average error of 1.5 °C) than VAL-34–VAL-53 duplexes (3.0 °C). The LNA content is low for the VAL-01–VAL-33 subset (20–25%) and varies from 30 to 60% for the latter subset.
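For readers who wish to apply nearest-neighbor parameters programmatically, the summation over doublets can be sketched as follows. This is a simplified illustration for a perfectly matched duplex described by one strand written 5′→3′ in "+" notation; it omits salt corrections, symmetry terms, and mismatches. The lookup table is a placeholder: only the +C+C value (−4.1 kcal/mol for +C+C/GG) is quoted from this article, and the other two entries are invented for the example. The function names are hypothetical.

```python
import re


def tokenize(seq):
    """Split '+C+CAG' into ['+C', '+C', 'A', 'G']; '+' marks an LNA nucleotide."""
    tokens = re.findall(r"\+?[ACGT]", seq)
    if "".join(tokens) != seq:
        raise ValueError(f"unexpected characters in {seq!r}")
    return tokens


def duplex_dg37(seq, doublet_dg, initiation_dg=0.0):
    """Sum nearest-neighbor doublet free energies (kcal/mol) along one strand."""
    tokens = tokenize(seq)
    total = initiation_dg
    for first, second in zip(tokens, tokens[1:]):
        total += doublet_dg[first + second]  # KeyError if a doublet is missing
    return total


# Placeholder parameters (kcal/mol); "A+C" and "+CG" are NOT published values.
example_params = {"A+C": -1.0, "+C+C": -4.1, "+CG": -2.0}
dg = duplex_dg37("A+C+CG", example_params)  # sums three doublets
```

A missing doublet raises a `KeyError` rather than being silently treated as zero, which makes gaps in a parameter table visible.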
We also predicted melting temperatures for 11 duplexes from published sources where one strand was LNA-modified from 89 to 100%. Initiation parameters for terminal LNAs were assumed to be identical to DNA initiation parameters.(16) Table S4 of the Supporting Information shows results. The average error of Tm predictions was higher for these duplexes (2.7 °C) than the error seen for the set of VAL-01–VAL-33 duplexes (1.5 °C). If an LNA strand is modified ≥50%, LNAs induce structural changes that could propagate beyond neighboring base pairs. In that case, the nearest-neighbor parameters and the model may be less accurate.
Thermodynamic parameters reveal the nature of stabilizing effects. The single strand to helix transition of nucleic acid is usually driven by favorable enthalpic changes associated with an increased level of stacking and H-bonding interactions. Entropic changes are unfavorable. Because single strands explore more degrees of freedom than the strands in the relatively stiff duplex structure, duplex formation incurs the entropic loss. Locked nucleic acids have been reported to alter both transition enthalpy and entropy,21,49 so the origin of LNA effects is uncertain.
The free energy change due to LNA residues can be divided into enthalpic (ΔΔH°) and entropic (−TΔΔS°) components, which are presented in the second and last columns, respectively, of Table 1. The values suggest that the stabilizing effect is of enthalpic origin. Consecutive LNAs induce favorable changes in the transition enthalpy, making it more negative (ΔΔH° of −1 to −9 kcal/mol per nearest-neighbor doublet). Changes in the entropic contribution to the free energy (the last column of Table 1) are either unfavorable or negligible. The values of −TΔΔS° range from 0 to 7 kcal/mol at 37 °C and are smaller in magnitude than ΔΔH°. Thus, we conclude that the higher stability of consecutive LNA·DNA base pairs is mostly the result of favorable contributions to the transition enthalpy. This is the case for all nearest-neighbor doublets, confirming that enthalpy drives stabilization of consecutive LNAs regardless of base sequence.
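The decomposition ΔΔG° = ΔΔH° + (−TΔΔS°) can be written as a small Python helper. The rule used to name the "driver" (the more negative of the two terms) is a simplification chosen for this sketch, and the numbers are hypothetical values that merely fall within the ranges quoted above; the function names are not from any published software.

```python
T37 = 310.15  # 37 °C expressed in kelvin


def entropic_term(dds_cal, temp_k=T37):
    """Convert ΔΔS° in cal/(mol·K) to −TΔΔS° in kcal/mol."""
    return -temp_k * dds_cal / 1000.0


def decompose(ddh_kcal, minus_t_dds_kcal):
    """Return (ΔΔG°, driver), where driver is the more favorable term."""
    ddg = ddh_kcal + minus_t_dds_kcal
    driver = "enthalpic" if ddh_kcal < minus_t_dds_kcal else "entropic"
    return ddg, driver


# Hypothetical doublet: favorable enthalpy partly offset by entropy.
consecutive = decompose(-5.0, 3.0)   # (-2.0, "enthalpic")
# Hypothetical case in which the entropic term is the favorable one.
isolated = decompose(1.0, -2.0)      # (-1.0, "entropic")
```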
These thermodynamic observations are related to structural changes. Stabilizing enthalpic effects of LNAs are equated with enhanced stacking interactions, potentially improved H-bonding of base pairs, and weakened hydration of the duplex state.(50) The LNA cytosine C5-methyl group, which is not present in the native DNA, may also increase the stacking energies due to additional van der Waals interactions with neighboring bases.(45) The entropic contributions of LNAs originate from backbone conformational preorganization, which is the result of restrictions of ribose flexibility in the C3′-endo (N-type) conformation.3,51 Because the modified ribose is similarly constrained in the single strand and in the duplex conformations, it has been argued that the smaller entropic loss occurs upon formation of LNA·DNA rather than DNA·DNA base pairs.
While we observe that the stabilization of consecutive LNAs is driven by enthalpic changes, McTigue et al. reported that the stabilizing effects of a single LNA modification are mostly entropic in origin.(21) Taken together, these findings suggest that the entropic changes characterized by restriction of nucleotide local conformations are achieved by the introduction of a single LNA nucleotide. Additional adjacent LNAs stabilize the duplex further by favorable enthalpic changes. This mechanism may explain the conflicting reports in the literature regarding the origin of LNA stabilization.
Structural studies have shown that both isolated and consecutive LNA residues restrict ribose conformation space and introduce structural changes in the double helix toward A-form. For example, LNAs widen the minor groove and decrease the value of the rise and the twist.51−54 The 1H NMR experiment with the C+TGA+TA+TGC sequence that contains only isolated LNA modifications failed to show significant changes in base stacking.(52) In contrast, LNAs in the C+TGC+T+TC+TGC sequence containing consecutive modifications enhanced base stacking.(53) Our fluorescence experiments using 2-aminopurine also detected enhanced stacking interactions in LNA triplets.(5) These apparent discrepancies can be reconciled assuming that the energetic impact of a single LNA in the duplex interior is dominated by entropic changes, and the subsequent addition of consecutive LNAs stabilizes duplexes by favorable enthalpic changes that are associated with enhanced stacking interactions.
Energetics of LNA modifications introduced at the duplex terminus may have a different character. Kaur et al. measured impacts of isolated LNA modifications at various positions.(49) The interior modifications decreased entropic loss in agreement with our rationale, but stabilizing effects of the terminal modification were driven by favorable enthalpic changes. We have not studied consecutive LNAs at the duplex terminus.
The A-form helical conformation that is preferred by LNA·DNA duplexes is also dominant in RNA·RNA and RNA·DNA duplexes. In fact, ribose puckering of the LNA·DNA duplex resembles closely the puckering of the RNA·DNA hybrid.(54) However, the structural similarity does not mean the same thermodynamic parameters. The LNA·DNA nearest-neighbor doublets are on average 1.4 kcal/mol more stable than RNA·DNA doublets.(19) For example, the ΔG°37 of +C+C/GG is −4.1 kcal/mol, while rCrC/GG is only half as stabilizing, −2.1 kcal/mol. The sequence dependence of parameters is also different. The least stable LNA doublet is +A+A/TT, while the rArA/TT doublet is more stable than five other RNA·DNA doublets. These significant differences reveal that thermodynamic parameters of RNA·DNA duplexes are not good approximations of LNA·DNA thermodynamics. The different composition of the ribose moiety, different patterns of hydration in the minor groove, the extra methyl group of +C, and subtle variations of the helical structure potentially explain these thermodynamic differences.
We show in Results that LNA·DNA and RNA·RNA mismatches exhibit a similar trend of stabilities, which deviates from the stability trend of DNA mismatches. To determine whether mismatch discrimination is enhanced in RNA duplexes as it is in LNAs, we predicted the free energy of mismatch discrimination (eq 16) for RNA, RNA·DNA, LNA, and DNA triplets. For each mismatch type, the ΔΔG° values were averaged over 16 possible triplet sequences containing the central mismatch. Predictions were based on the established nearest-neighbor parameters for matched LNA, DNA, and RNA base pairs (Table 2 and refs (16), (17), and (19)). For mismatches, the complete set of thermodynamic parameters is available for LNA·DNA and DNA·DNA pairs (Table 3 and ref (20)). Because parameters for many RNA·DNA mismatches are unknown, we averaged ΔΔG° for eight rG·T sequence contexts reported by the Sugimoto group(45) and predicted the average ΔΔG° values for rA·A, rG·G, and rC·C mismatches. Their parameters were recently determined.(46) The rA·rA, rG·rG, and rC·rC RNA·RNA mismatches were approximated by the algorithm of Davis and Znosko.(44) Mathews, Sabina, Zuker, and Turner parameters were used for the rG·rU mismatch.(18) The RNA calculations were conducted with MELTING version 5.0.3.(55)
Figure 5 shows the average free energies of duplex destabilization due to a mismatch. The general trend of increasing discriminatory power for the A·A, G·G, and C·C mismatches is as follows: DNA·DNA < RNA·DNA < RNA·RNA ≤ LNA·DNA. These mismatches destabilize the LNA·DNA and RNA·RNA duplexes more than the DNA·DNA duplexes. To a lesser degree, the level of mismatch discrimination also increases in RNA·DNA duplexes. The opposite trend is seen for the wobble G·T base pair. The DNA·DNA mismatch shows the strongest discrimination. The +G·T, rG·T, and rG·rU mismatches discriminate less.
This analysis suggests that the enhanced mismatch discrimination is not a unique property of locked nucleic acids but rather the result of structural changes of nucleic acids from B-form to A-form. DNA·DNA duplexes in water solutions are in B-like conformations. The RNA·DNA hybrids fold into structures that are intermediates of A- and B-forms. The RNA·RNA duplexes occur in the A-form conformation, which is also the structure of LNA·DNA base pairs.53,54 As the conformational equilibrium is shifted toward the A-form, the level of mismatch discrimination increases. This is likely the result of energetic changes in stacking interactions, H-bonding of base pairs, and hydration envelope when the duplex turns to the A-like conformation.
One significant structural change from B-form to A-form is the compaction of the rise between base pairs along the helical axis. The rise is significantly smaller in the A-form (0.26 nm) than in the B-form (0.34 nm). Because of the shorter distances, LNA nucleotides in the A-like structure may engage in stronger stacking interactions, which are disrupted by mismatches. If our hypothesis is correct, the enhancements of mismatch discrimination can be expected for any modification that shifts the conformation equilibrium from B-form to A-form, e.g., 2′-O-methyl-RNA, 2′-O-[2-(methoxy)ethyl]-RNA, 2′-deoxy-2′-fluoro-RNA, and N3′→P5′-phosphoramidate-DNA.56−59 As discussed earlier, +G·T mismatches are the exception; their level of mismatch discrimination decreases when LNA-modified guanine is introduced at the mismatch site. This could be a result of improved stacking interactions of guanine with neighboring bases. These stacking interactions are not significantly weakened by a thymine mismatch because the G·T pair is stabilized by two hydrogen bonds and is well-stacked in the duplex structure. Small pyrimidine bases are expected to stack less than large purine bases. This may explain the opposite discriminatory effects of LNAs in +T·G versus +G·T mismatches.
Chemical differences among LNA, RNA, and DNA are the composition and conformation of the ribose moiety. Another difference is the C5-methyl group in pyrimidine nucleobases. In RNA, uracil and cytosine are typically unmethylated. In DNA, thymine is C5-methylated and cytosine is not. In LNA nucleotides, both thymine and cytosine are C5-methylated.
Wang and Kool investigated thermodynamic effects of C5-methyl and 2′-OH groups in DNA and RNA duplexes.(60) The methyl group stabilized duplexes on average by 0.25 kcal/mol, and its effects on ΔG° were largely independent of 2′-hydroxyl effects. The C5-methyl appeared to enhance base stacking. Ziomek et al. studied 5-alkyl and 5-halogen analogues of uracil in (rArUrCrUrArGrArU)2 duplexes.(61) The methyl group stabilized the RNA duplex slightly (ΔΔG° < 0.1 kcal/mol). Sugimoto et al. examined thermodynamics of pyrimidine methyl groups in RNA·DNA mismatches.(45) The rG·dU mismatches were found to be less stable than rG·dT mismatches regardless of neighboring sequence context. The free energy contribution of the thymine C5-methyl was estimated to vary from 0.1 to 0.5 kcal/mol. The methyl moiety likely has a similar thermodynamic impact on the LNA cytosine residue.(62)
The extra methyl group of pyrimidines is not the driver of mismatch discrimination trends. The increase in the level of discrimination occurs in purine mismatches (+A·A and +G·G) and in the sequence contexts that do not contain methylated LNA cytosine. For example, LNAs increase free energies of +A·A mismatch discrimination in the center of the +G+A+G triplet by 1.0 kcal/mol. Further, relative to DNA, the extra C5-methyl group is present in LNA cytosine, but not in RNA cytosine. In both cases, the level of mismatch discrimination increases; i.e., both LNA and RNA duplexes have more discriminatory power than DNA. The presence of the C5-methyl group does not appear to be essential for discriminatory effects.
Sufficient mismatch discrimination is important for many oligonucleotide applications. Locked nucleic acids enhance discrimination due to two impacts. First, LNAs increase the stability of oligonucleotide probes. This allows the use of shorter sequences with more discriminatory power because the mismatch has a much larger impact on the duplex stability in shorter sequences than in longer ones.(5) This length effect is very significant in duplexes with <30 bp. The ΔΔG° and ΔTm differences between matched and single-base mismatched duplexes can double when the duplex length is decreased from 25 to 17 bp.
Second, locked nucleic acids can also increase specificity directly if they are located at or next to the mismatch site. We have discovered that the triplet of LNA residues containing the mismatch in the center has the largest discriminatory power; a single LNA modification usually discriminates less.(5) We therefore recommend using the LNA triplet at the mismatch site. This design will increase the discriminatory power for a majority of mismatches (in particular, for A·G, T·C, C·C, G·G, A·A, and T·T). New results also pinpoint several anomalies. LNAs in some +G·T and +C·A mismatches impact discrimination negatively. In these cases, it is not advised to introduce the LNA modifications at the mismatch site, but LNAs could be placed ≥2 bp from the mismatch to increase the stability of the probe–target duplex and make the probe shorter. The short probe will likely exhibit more discriminatory power. Alternatively, the probe could be redesigned to target the complementary strand if it is available in the biological sample. This will change the +G·T mismatch into the +T·G mismatch; the latter is more likely to show positive effects of LNA on discrimination.
It is also important to optimize the location of mismatches within the probe. The mismatches at the terminus or adjacent to the terminus (penultimate mismatches) show significantly less discrimination than the mismatches in the duplex interior.5,20 It is preferable to place mismatches at least 3 bp from the termini of the probe–target duplex. Although the mismatch site in the center of the duplex maximizes the discrimination, it is not essential for the mismatch to be located exactly in the center of the oligonucleotide probe. As long as the mismatch is positioned in the interior of the duplex and not next to the termini, its discriminatory power (ΔΔG°) will be very close to the maximum.
To help design optimal LNA oligonucleotides, free software is available at the IDT websites http://biophysics.idtdna.com and http://www.idtdna.com. The web tools predict melting temperatures, free energies, and the extent of hybridization using the latest nearest-neighbor parameters, including parameters from Tables 2 and 3. It is important to enter conditions of the experiments (e.g., cation and DNA concentrations) to obtain the relevant predictions. Users can test effects of LNA modifications and mismatches at any location within their sequence. The potential LNA probes can be compared with unmodified probes to estimate benefits of modifications. The probes can be ranked by their mismatch discrimination energetics (ΔΔG° and ΔTm) and tuned to the hybridization temperature of a specific application. It is often optimal if the probe has a melting temperature 3–5 °C above the annealing temperature. The perfectly matched probe–target duplex will be stable, while the mismatched duplex is likely to be unstable under those conditions and will not give a false positive signal.
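The 3–5 °C guideline can be checked with a minimal Python sketch. It is not the algorithm of the IDT web tools: it assumes a two-state transition of a non-self-complementary duplex with equal strand concentrations, for which Tm = ΔH°/(ΔS° + R ln(Ct/4)), and it applies no salt correction. The function names, ΔH°, ΔS°, total strand concentration Ct, and annealing temperatures are hypothetical.

```python
import math

R = 1.9872  # gas constant in cal/(mol·K)


def two_state_tm(dh_kcal, ds_cal, ct_molar):
    """Tm (°C) of a non-self-complementary duplex with equal strand amounts.

    dh_kcal: ΔH° in kcal/mol, ds_cal: ΔS° in cal/(mol·K),
    ct_molar: total strand concentration in mol/L.
    """
    tm_kelvin = dh_kcal * 1000.0 / (ds_cal + R * math.log(ct_molar / 4.0))
    return tm_kelvin - 273.15


def margin_ok(tm_c, anneal_c, low=3.0, high=5.0):
    """True if Tm lies low..high °C above the annealing temperature."""
    return low <= tm_c - anneal_c <= high


# Hypothetical probe: ΔH° = −60 kcal/mol, ΔS° = −160 cal/(mol·K), Ct = 2 µM.
tm = two_state_tm(-60.0, -160.0, 2e-6)
suitable = margin_ok(tm, anneal_c=40.0)
```

For real designs the thermodynamic values should come from nearest-neighbor predictions at the actual cation and oligonucleotide concentrations, as recommended above.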
Many applications also require that chimeric oligonucleotides bind effectively and exclusively to DNA complements. The design must therefore exclude sequences that can form stable hairpins, dimers, and other self-folding structures. This is important because LNA·LNA base pairs are more stable than isosequential LNA·DNA base pairs.(63) Because thermodynamic parameters for LNA·LNA base pairs, LNA bulges, and hairpin loops are unknown, it is not currently possible to accurately predict the propensity of an LNA oligonucleotide to form self-folding structures. The simple approach is to avoid long stretches of consecutive LNAs. This approach makes stable LNA·LNA duplexes less likely to appear but also unnecessarily impedes probe design. Accurate predictions of LNA·LNA base pair stability would be useful.
The tendency of the base sequence to form hairpins can be estimated by the hairpin function of the IDT OligoAnalyzer tool.(64) The self-dimer function shows the potentially stable structures that can form between two molecules. The heterodimer function estimates interactions between the probe and the primers. If the predicted structure contains several consecutive LNA·LNA base pairs, it could be stable enough to compete with the formation of the probe–target duplex and the assay would be negatively impacted. For such sequences, a single LNA modification could be a better choice than the LNA triplet.
The design of real-time PCR hydrolysis probes (e.g., TaqMan probes) calls for additional considerations. This family of assays relies on the 5′ exonuclease activity of the polymerase, which degrades the probe and releases the dye attached to the 5′ terminus of the probe. Locked nucleic acids cannot be introduced at the 5′ terminus of the probe or at the adjacent nucleotide because they would increase nuclease resistance and interfere with the desired probe degradation.
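Two of the design rules discussed in this section lend themselves to simple automated checks: no LNA at the 5′ terminal or adjacent nucleotide of a hydrolysis probe, and mismatches placed at least 3 bp from the duplex termini. The Python sketch below is one possible reading of these rules. It assumes probes written 5′→3′ in "+" notation, 0-based positions, and a distance counted as the index offset from the terminal base pair; the function names are hypothetical.

```python
import re


def lna_positions(probe):
    """0-based positions, counted from the 5' end, of LNA nucleotides."""
    tokens = re.findall(r"\+?[ACGT]", probe)
    if "".join(tokens) != probe:
        raise ValueError(f"unexpected characters in {probe!r}")
    return [i for i, token in enumerate(tokens) if token.startswith("+")]


def hydrolysis_probe_ok(probe):
    """False if an LNA occupies the 5' terminal or the adjacent nucleotide."""
    return all(position >= 2 for position in lna_positions(probe))


def mismatch_interior(mismatch_pos, duplex_len, min_dist=3):
    """True if the mismatch is at least min_dist positions from both termini."""
    return min(mismatch_pos, duplex_len - 1 - mismatch_pos) >= min_dist


print(hydrolysis_probe_ok("AC+G+T+AGC"))  # prints True
print(hydrolysis_probe_ok("A+CGTAGC"))    # prints False
print(mismatch_interior(8, 17))           # prints True
```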
Although the new parameter set is a significant addition toward a complete thermodynamic model of LNA modifications, parameters for some important LNA structures have yet to be determined (mismatches adjacent to a single LNA modification, LNA·LNA base pairs, bulges, and tandem mismatches). We also do not have parameters for LNAs at duplex termini, although such modifications are employed in PCR primers. The parameters in Table 3 were determined for mismatches located in the interior of a duplex and will not be accurate at the terminus. The mismatch in the terminal or penultimate position often affects duplex stability less than the same mismatch located in the interior, i.e., ≥3 bp from the terminus of the duplex.(20)
We thank Derek M. Thomas for assistance with capillary electrophoresis and mass spectrometry tests of oligonucleotide samples.
National Institutes of Health, United States
This work was supported by Grant R43GM081959 from the National Institute of General Medical Sciences.
Thermodynamic values for studied duplexes, figures of 1/Tm versus ln Ct fits, unique thermodynamic parameters for triplets containing single-base mismatches, and melting temperatures of validation data sets. This material is available free of charge via the Internet at http://pubs.acs.org.