Oligonucleotides were synthesized at Integrated DNA Technologies, purified by HPLC,(29) and dialyzed against storage buffer [10 mM Tris-HCl and 0.1 mM Na_{2}EDTA (pH 7.5)].(28) Concentrated oligonucleotide samples were tested by mass spectroscopy (molecular weights were within 2 g/mol) and capillary electrophoresis (>90% pure). DNA concentrations were determined from predicted extinction coefficients (ε) and sample absorbance at 260 nm using the Beer–Lambert law.^{29,30} LNA nucleotides were assumed to possess the same extinction coefficients as DNA ones. Coefficients of Texas Red (14400 L mol^{–1} cm^{–1}) or Iowa Black RQ (44510 L mol^{–1} cm^{–1}) were added to the ε of labeled oligonucleotides.
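The concentration calculation described above can be sketched as follows. This is a minimal illustration, not the authors' implementation; the function name, the 1 cm path length, and the example ε of the unlabeled sequence are assumptions, while the two label coefficients are the values quoted in the text.

```python
# Beer-Lambert law: A = epsilon * c * l, so c = A / (epsilon * l).

# Label extinction coefficients quoted in the text (L mol^-1 cm^-1).
LABEL_EPSILON = {"TXRD": 14400.0, "IBRQ": 44510.0}


def strand_concentration(a260, epsilon_oligo, label=None, path_cm=1.0):
    """Return the molar strand concentration from the absorbance at 260 nm.

    epsilon_oligo is the predicted coefficient of the unlabeled sequence;
    the label coefficient, if any, is added to it as described in the text.
    """
    epsilon = epsilon_oligo + (LABEL_EPSILON[label] if label else 0.0)
    return a260 / (epsilon * path_cm)


# Hypothetical example: a Texas Red-labeled strand whose unlabeled sequence
# has a predicted epsilon of 100000 L mol^-1 cm^-1 (assumed value).
conc = strand_concentration(a260=0.50, epsilon_oligo=100000.0, label="TXRD")
print(f"{conc * 1e6:.2f} uM")
```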

Primary Set of Oligonucleotides

Figure A shows the sequences studied. Fluorescent Texas Red dye (TXRD)
is attached at the 5′ end of the top strand, and Iowa Black
RQ quencher (IBRQ) is attached at the 3′ end of the complementary
strand. This design efficiently quenches fluorescence when the strands
are annealed because the dye and the quencher are in close contact.
We use the notation of oligonucleotide manufacturers; LNA nucleotides
are indicated with + in front of the base symbol (e.g., +A denotes
an adenine LNA nucleotide). Cytosine of +C is 5-methylated because
oligonucleotide manufacturers usually synthesize the methylated version
of LNA cytosine.

The set of DNA duplexes contains a triplet of consecutive LNAs located either in the interior of the strand labeled with Texas Red or in the interior of the complementary strand labeled with Iowa Black RQ. Eight possible LNA·DNA base pairs (X·Y = +A·T, A·+T, +T·A, T·+A, +C·G, C·+G, +G·C, and G·+C) and 24 mismatches were introduced at the X·Y site. Core duplexes were also measured. They contained DNA·DNA base pairs (X·Y = A·T, T·A, C·G, and G·C) and the same terminal Texas Red–Iowa Black RQ pair. This design is economical; each oligonucleotide is used in several duplexes. Thirty-six duplexes were melted for each set except for set 3. This set consisted of 27 duplexes because two of its sequences, GTAGGGGTGCT-IBRQ and GTA+G+G+GGTGCT-IBRQ, were not obtained with sufficient purity.

For sets 1–4, the same base flanks the X·Y
site on the 5′ and 3′ sides. For sets 5–8, the
flanking bases are different and each of the four bases (A, T, C,
and G) occurs once on the 5′ and 3′ sides of the X·Y
base pairs. This design ensures that every possible nearest-neighbor
interaction is present several times within the data set.

Figure A also shows that duplex lengths range from 10 to
12 bp. Such short sequences are likely to melt in a two-state manner.
Nevertheless, non-two-state behavior may occur even for short oligonucleotides
if they form stable self-complementary structures, e.g., hairpins
or dimers. OligoAnalyzer version 3.1 (http://www.idtdna.com/analyzer/Applications/OligoAnalyzer/) confirmed that our sequences do not form such structures.

This paper follows previous conventions to represent duplex sequences.(16) A slash divides the strands in an antiparallel
orientation. The sequence is oriented 5′ to 3′ before
the slash and 3′ to 5′ after the slash (for example,
CA/GT represents the 5′-CA-3′/3′-GT-5′
doublet with two Watson–Crick base pairs). Mismatched nucleotides
are underlined or colored red. Ribonucleotides are distinguished from
deoxyribonucleotides by the “r” prefix, e.g., rA.

Melting Experiments

We followed our previously described method for fluorescence melting experiments.(28) The melting buffer contained 1 M NaCl, 3.87 mM NaH_{2}PO_{4}, 6.13 mM Na_{2}HPO_{4}, and 1 mM Na_{2}EDTA and was adjusted to pH 7.0 with 1 M NaOH.(30) Buffer reagents of p.a. grade purity were bought from ThermoFisher Scientific (Pittsburgh, PA).

Melting experiments were performed
at 13 different total single-strand concentrations (19, 30, 46, 70,
110, 160, 250, 375, 570, and 870 nM and 1.3, 2.0, and 3.0 μM).
Duplex samples were prepared at the highest *C*_{t} of 3 μM. Complementary oligonucleotides were mixed
in a 1:1 ratio in the melting buffer, heated to 95 °C, and slowly
cooled to room temperature. Aliquots of the 3 μM solution were
diluted with the melting buffer to make the 12 remaining samples. Low-binding
Costar microcentrifuge tubes (catalog no. 3207, Corning, Wilkes Barre,
PA) were used to reduce the level of binding of oligonucleotides to
the tube surface.

We pipetted 25 μL of the melting sample
into two wells of a 96-well PCR plate (Extreme Uniform Thin Wall Plates,
catalog no. B70501, BIOplastics BV, Landgraaf, The Netherlands). A
significant discrepancy between wells alerted us to an erroneous measurement.
Using the Bio-Rad iQ5 real-time PCR system, the fluorescence signal
in the Texas Red channel was recorded every 0.2 °C while the
temperature was increased from 4 to 98 °C and decreased back
to 4 °C over two cycles. Subsequent temperature cycles were not
used because they were unreliable; *T*_{m} sometimes increased, indicating the evaporation of water or degradation of dye. The iQ5 system maintained a heating and cooling rate of 25 °C/h. Analysis was conducted in Microsoft Excel. We programmed VBA software to automate melting profile analysis, including baseline selection using a second-derivative algorithm.(28) The fraction θ was calculated [θ = (*F* – *F*_{L})/(*F*_{U} – *F*_{L})] from the fluorescence of the DNA sample (*F*), the fluorescence of the upper linear baseline (*F*_{U}), and the fluorescence of the lower linear baseline (*F*_{L}). If a duplex melts in a two-state manner, dissociation of the fluorophore from the quencher is likely coupled to the duplex-to-single strand melting transition and θ represents the fraction of melted duplexes.
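The θ calculation can be illustrated with a short sketch. It assumes the two linear baselines have already been selected and are given as (slope, intercept) pairs; the function name and the numerical values are hypothetical, and the second-derivative baseline selection used by the authors is not reproduced here.

```python
def fraction_melted(temps, fluor, lower, upper):
    """Return theta = (F - F_L) / (F_U - F_L) at each temperature.

    lower and upper are (slope, intercept) pairs describing the linear
    baselines F_L(T) and F_U(T); both are assumed to be known already.
    """
    thetas = []
    for t, f in zip(temps, fluor):
        f_lower = lower[0] * t + lower[1]
        f_upper = upper[0] * t + upper[1]
        thetas.append((f - f_lower) / (f_upper - f_lower))
    return thetas


# Hypothetical three-point profile (temperature in deg C, arbitrary units).
temps = [20.0, 50.0, 80.0]
fluor = [120.0, 625.0, 1160.0]
thetas = fraction_melted(temps, fluor, lower=(1.0, 100.0), upper=(2.0, 1000.0))
print(thetas)
```

In this toy profile the sample sits on the lower baseline at 20 °C, halfway between the baselines at 50 °C, and on the upper baseline at 80 °C.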

The melting temperature
was defined as the temperature at which θ = ^{1}/_{2}. The average standard deviation of *T*_{m} values was 0.4 °C. Transition enthalpies, entropies,
and free energies were determined from fits to individual melting
profiles and from the dependence of melting temperature on DNA concentration.^{14,28,31,32} These two analytical methods assume that melting transitions proceed
in a two-state manner; that is, intact duplex and unhybridized single
strands are dominant, and partially melted duplexes are negligible
throughout the melting transition. The methods also assume that transition
enthalpies and entropies are temperature-independent. If Δ*H*° or Δ*S*° values differed by more than 15% between these two methods, the duplex was judged not to melt in a two-state manner.^{28,32,33} In that case, we excluded the Δ*H*° or Δ*S*° values from further analysis because they were inaccurate.
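The concentration-dependence method can be sketched as follows. The sketch assumes the standard two-state relation for non-self-complementary duplexes, 1/*T*_{m} = (*R*/Δ*H*°) ln(*C*_{t}/4) + Δ*S*°/Δ*H*°, which the text does not write out. The function names, the synthetic Δ*H*° and Δ*S*° values, and the convention of taking the 15% difference relative to the mean of the two estimates are all assumptions.

```python
import math

R = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def vant_hoff_fit(ct_molar, tm_kelvin):
    """Fit a line through (ln(Ct/4), 1/Tm) and return (dH, dS).

    dH is in kcal/mol and dS in kcal mol^-1 K^-1.
    """
    x = [math.log(ct / 4.0) for ct in ct_molar]
    y = [1.0 / tm for tm in tm_kelvin]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                    # equals R / dH
    intercept = mean_y - slope * mean_x  # equals dS / dH
    d_h = R / slope
    return d_h, intercept * d_h


def agrees_within(value_a, value_b, tolerance=0.15):
    """True if two estimates differ by at most `tolerance` of their mean.

    Taking the difference relative to the mean is an assumed convention.
    """
    return abs(value_a - value_b) <= tolerance * abs((value_a + value_b) / 2.0)


# The 13 total strand concentrations listed in the text, in mol/L.
CT = [19e-9, 30e-9, 46e-9, 70e-9, 110e-9, 160e-9, 250e-9,
      375e-9, 570e-9, 870e-9, 1.3e-6, 2.0e-6, 3.0e-6]

# Synthetic, noise-free melting temperatures from hypothetical dH and dS.
TRUE_DH, TRUE_DS = -90.0, -0.250
tm = [TRUE_DH / (TRUE_DS + R * math.log(ct / 4.0)) for ct in CT]

fit_dh, fit_ds = vant_hoff_fit(CT, tm)
print(round(fit_dh, 3), round(fit_ds, 5))
```

Because the synthetic melting temperatures are noise-free, the fit returns the input values; with real data the two estimates would then be compared with `agrees_within`.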

Stabilizing Effects of LNA Modifications

Locked nucleic
acids increase duplex stability and alter the melting transition enthalpy,
entropy, and free energy. As shown in Figure B, we determined these LNA contributions (ΔΔ*H*°, ΔΔ*S*°, and ΔΔ*G*°_{37}) from the difference between LNA-modified and core duplexes.(28) LNA modifications
were located at least five nucleotides from the terminal fluorophore
and the quencher. In this design, terminal labels do not interact
with LNAs and do not influence differential thermodynamic values between
modified and core duplexes.

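Figure B shows an example of the analysis for the Set1–11 duplex. Entering Δ*H*° from Table S1 of the Supporting Information, we determined the experimentally measured differential enthalpic change [ΔΔ*H*°(A+T+G+TC/TACAG)] to be −97.6 – (−86.4) = −11.2 kcal/mol. In the nearest-neighbor model, this enthalpic contribution is a sum of enthalpies of base pair doublets

$$\Delta\Delta H^\circ(\text{A+T+G+TC/TACAG}) = \Delta\Delta H^\circ(\text{A+T/TA}) + \Delta\Delta H^\circ(\text{+T+G/AC}) + \Delta\Delta H^\circ(\text{+G+T/CA}) + \Delta\Delta H^\circ(\text{+TC/AG}) \tag{3}$$

Rearrangement of eq 3 places unknown LNA parameters on the left side

$$\Delta\Delta H^\circ(\text{+T+G/AC}) + \Delta\Delta H^\circ(\text{+G+T/CA}) = \Delta\Delta H^\circ(\text{A+T+G+TC/TACAG}) - \Delta\Delta H^\circ(\text{A+T/TA}) - \Delta\Delta H^\circ(\text{+TC/AG}) \tag{4}$$

The right side of eq 4 contains the experimentally measured enthalpic change and two previously determined nearest-neighbor parameters.(21) McTigue, Peterson, and Kahn investigated the thermodynamics of interactions between LNA·DNA and DNA·DNA base pairs. We used their parameters to account for LNA–DNA interactions that occur in the beginning and at the end of a section of consecutive LNAs. Parameters from their 32NN set (Table 4 of ref (21)) were entered into eq 4

$$\Delta\Delta H^\circ(\text{+T+G/AC}) + \Delta\Delta H^\circ(\text{+G+T/CA}) = -11.2\ \text{kcal/mol} - \Delta\Delta H^\circ(\text{A+T/TA}) - \Delta\Delta H^\circ(\text{+TC/AG}) \tag{5}$$

A similar equation was constructed for each LNA duplex. Analogous equations were set up for ΔΔ*S*° and ΔΔ*G*°_{37} contributions.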
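The decomposition of a duplex subsequence into nearest-neighbor doublets can be sketched in a few lines. The helper name is hypothetical; the input follows the slash notation used in this paper (top strand 5′ to 3′, bottom strand 3′ to 5′, + marking an LNA nucleotide), and the enthalpies are the two Table S1 values quoted above.

```python
import re


def split_doublets(duplex):
    """Split e.g. 'A+T+G+TC/TACAG' into its nearest-neighbor doublets."""
    top, bottom = duplex.split("/")
    # A nucleotide is an optional '+' (LNA) followed by one base letter.
    top_nt = re.findall(r"\+?[ACGT]", top)
    bottom_nt = re.findall(r"\+?[ACGT]", bottom)
    if len(top_nt) != len(bottom_nt):
        raise ValueError("strands must have the same length")
    return [
        f"{top_nt[i]}{top_nt[i + 1]}/{bottom_nt[i]}{bottom_nt[i + 1]}"
        for i in range(len(top_nt) - 1)
    ]


doublets = split_doublets("A+T+G+TC/TACAG")
print(doublets)  # ['A+T/TA', '+T+G/AC', '+G+T/CA', '+TC/AG']

# Differential enthalpy of the example: LNA duplex minus core duplex.
ddh_exp = -97.6 - (-86.4)  # kcal/mol
print(round(ddh_exp, 1))   # -11.2
```

The first and last doublets are the LNA–DNA junctions, and the two middle doublets carry the unknown consecutive-LNA parameters.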

Determination of LNA Nearest-Neighbor Parameters

Selecting two bases from the set of four (A, T, C, and G) with replacement leads to the creation of 16 nearest-neighbor doublets.(34) Because antiparallel strands of native DNA duplexes exhibit
structural symmetry, some doublet sequences are identical, e.g., AC/TG
and GT/CA. Therefore, 10 nearest-neighbor parameters are sufficient
to represent internal DNA·DNA doublets. No such symmetry exists
for LNA·DNA base pairs. The +A+C/TG doublet differs from the
+G+T/CA doublet. Sixteen nearest-neighbor parameters are needed for
consecutive LNA·DNA base pairs.
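The counting argument can be checked with a short enumeration. This is an illustrative sketch with hypothetical helper names: it builds the 16 Watson–Crick doublets, reads each one from the other strand, and merges the doublets that turn out to be the same.

```python
from itertools import product

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def dna_doublet(x, y):
    """Return 'XY/X'Y'' with the top strand 5'->3' and the bottom 3'->5'."""
    return f"{x}{y}/{COMPLEMENT[x]}{COMPLEMENT[y]}"


def rotate(doublet):
    """Read the same doublet starting from the 5' end of the other strand."""
    top, bottom = doublet.split("/")
    return f"{bottom[::-1]}/{top[::-1]}"


all_doublets = [dna_doublet(x, y) for x, y in product("ATCG", repeat=2)]

# For DNA-DNA doublets the rotated doublet is the same molecule, so each
# pair {d, rotate(d)} needs only one parameter.
unique_dna = {min(d, rotate(d)) for d in all_doublets}
print(len(all_doublets), len(unique_dna))  # 16 10

# For consecutive LNAs the rotation moves the LNA nucleotides to the other
# strand (e.g. +A+C/TG would become GT/+C+A), so no two of the 16 doublets
# with LNAs in the top strand coincide and all 16 parameters are needed.
```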

We measured 62 perfectly
matched LNA duplexes. Sixty of them melted in a two-state manner.
Their thermodynamic values were used to determine the parameters.
Each of the 16 LNA doublets was well represented in this data set
with the following numbers of occurrences: 8 +A+A/TT, 8 +A+C/TG, 8
+A+G/TC, 8 +A+T/TA, 8 +C+A/GT, 6 +C+C/GG, 7 +C+G/GC, 8 +C+T/GA, 8
+G+A/CT, 7 +G+C/CG, 4 +G+G/CC, 8 +G+T/CA, 8 +T+A/AT, 8 +T+C/AG, 8
+T+G/AC, and 8 +T+T/AA.

First, we examined enthalpic effects. Equation 5 was constructed for each LNA duplex. This thermodynamic analysis produced the set of 60 linear equations

$$\mathbf{M}\,\mathbf{H}^{\mathrm{n-n}} = \mathbf{H}^{\mathrm{exp}} \tag{6}$$

where **M** is a 60 × 16 matrix of the number of occurrences for each LNA nearest-neighbor doublet in 60 duplexes, **H**^{n–n} is the column vector of 16 unknown parameters, and **H**^{exp} is the column vector of experimentally measured enthalpic contributions. The parameters reported by McTigue et al. were subtracted from the enthalpic contributions as shown in eqs 4 and 5. Because the number of unknown parameters (16) was less than the number of equations (60), eq 6 was overdetermined.(35) We solved it using singular-value decomposition (SVD)(36) by minimizing χ^{2}

$$\chi^2 = \left| \boldsymbol{\sigma}_H^{-1} \left( \mathbf{M}\,\mathbf{H}^{\mathrm{n-n}} - \mathbf{H}^{\mathrm{exp}} \right) \right|^2 \tag{7}$$

where **σ**_{H} is the diagonal matrix whose elements are experimental errors of ΔΔ*H*°. Because these errors were similar, they were set to a constant value of 3 kcal/mol and the SVD fit was not error weighted. Singular-value decomposition was conducted using Microsoft Excel Add-in, Matrix.xla package, version 2.3.2 (Foxes Team, L. Volpi, http://digilander.libero.it/foxes). Calculations were repeated using the Excel LINEST function, yielding the same values. We also examined matrix **M** for degeneracies. The rank of matrix **M** was 16. Because the rank was equal to the number of unknown parameters, the matrix had no zero singular values, and the parameters were unique and linearly independent.^{34−36}

Next, we replaced ΔΔ*H*° values with ΔΔ*S*° or ΔΔ*G*°_{37} values in eqs 5–7. Analogous analyses gave us nearest-neighbor parameters for entropies and free energies.
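The fitting step can be sketched with NumPy in place of the Excel add-in. The count matrix, the "true" parameters, and the noise level below are synthetic and purely illustrative; only the dimensions (60 duplexes, 16 doublets, two consecutive-LNA doublets per duplex) follow the text.

```python
import numpy as np

N_DUPLEXES, N_PARAMS = 60, 16

# Synthetic count matrix (not the paper's data): each duplex contributes
# two LNA-LNA doublets, chosen deterministically so that M has full rank.
M = np.zeros((N_DUPLEXES, N_PARAMS))
for i in range(N_DUPLEXES):
    first = i % N_PARAMS
    second = (first + 1 + i // N_PARAMS) % N_PARAMS
    M[i, first] += 1
    M[i, second] += 1

rng = np.random.default_rng(0)
h_true = rng.normal(-2.0, 1.0, N_PARAMS)               # hypothetical, kcal/mol
h_exp = M @ h_true + rng.normal(0.0, 0.3, N_DUPLEXES)  # simulated data

# Thin SVD, M = U diag(s) Vt. The numerical rank is the number of singular
# values above a small tolerance.
U, s, Vt = np.linalg.svd(M, full_matrices=False)
rank = int(np.sum(s > s.max() * max(M.shape) * np.finfo(float).eps))

# Unweighted least-squares solution; dividing by s is valid only because
# the rank equals the number of parameters (no zero singular values).
h_fit = Vt.T @ ((U.T @ h_exp) / s)

# Cross-check with NumPy's built-in least-squares solver.
h_lstsq, *_ = np.linalg.lstsq(M, h_exp, rcond=None)
print(rank, np.allclose(h_fit, h_lstsq))
```

Because every measurement is given the same error, the weighted and unweighted problems have the same minimizer, which is why a constant σ drops out of the fit.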

Error Analysis

Error estimates of parameters were obtained from bootstrap simulations.(37) These calculations estimate the dependence of parameter values on the data set. Many bootstrap data sets were created from the original data set. A different value of the parameter was usually determined from each bootstrap data set. The bootstrap estimate of the parameter error is given by the standard deviation of all these parameter values.

In our simulations, the bootstrap data sets were the same size as the original data set; i.e., the sets contained data from 60 duplexes. The duplex data were randomly drawn, with replacement, from the original data set. That is, every draw was made from the entire experimental data set. This procedure produced bootstrap data sets in which some duplex data from the original data set were present multiple times and other data were not selected. We generated 5 × 10^{4} bootstrap data sets. Equation 6 was solved for each data set using SVD, and 16 parameters for the consecutive LNAs were determined. If the rank of **M** was less than 16, the particular bootstrap data set did not contain all possible nearest-neighbor doublet sequences. Thermodynamic parameters could not be determined in this case; therefore, the bootstrap set was excluded from analysis, and a replacement data set was drawn. Fewer than 3% of the data sets were excluded. Standard deviations and averages were calculated from bootstrap parameter estimates. The average parameters determined from bootstrap analysis agreed with the parameters determined from the original data set.
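The bootstrap procedure can be sketched as follows. The function name and the three-parameter toy data set are hypothetical; the resampling with replacement, the rejection of rank-deficient resamples, and the use of the standard deviation as the error estimate follow the description above.

```python
import numpy as np


def bootstrap_parameter_errors(M, y, n_boot=500, seed=0):
    """Return (mean, standard deviation, number of rejected resamples)."""
    rng = np.random.default_rng(seed)
    n_rows, n_params = M.shape
    estimates, n_rejected = [], 0
    while len(estimates) < n_boot:
        # Draw row indices with replacement from the whole data set.
        idx = rng.integers(0, n_rows, size=n_rows)
        M_b, y_b = M[idx], y[idx]
        # A rank-deficient resample cannot determine all parameters;
        # discard it and draw a replacement.
        if np.linalg.matrix_rank(M_b) < n_params:
            n_rejected += 1
            continue
        params, *_ = np.linalg.lstsq(M_b, y_b, rcond=None)
        estimates.append(params)
    estimates = np.asarray(estimates)
    return estimates.mean(axis=0), estimates.std(axis=0, ddof=1), n_rejected


# Toy data set (not the paper's): 20 "duplexes", 3 parameters.
patterns = np.array([[1, 1, 0], [0, 1, 1], [1, 0, 1], [2, 0, 0]], dtype=float)
M_demo = np.tile(patterns, (5, 1))
true_params = np.array([-1.0, -2.0, -3.0])
noise = np.random.default_rng(1).normal(0.0, 0.1, M_demo.shape[0])
y_demo = M_demo @ true_params + noise

boot_mean, boot_sd, rejected = bootstrap_parameter_errors(M_demo, y_demo)
print(boot_mean, boot_sd, rejected)
```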

We have analyzed the error in free energy calculated from entropic and enthalpic contributions (ΔΔ*G*° = ΔΔ*H*° – *T*ΔΔ*S*°). Enthalpies and entropies of DNA melting transitions are correlated.(38) The errors of the enthalpic contribution, σ(ΔΔ*H*°), and the entropic contribution, σ(ΔΔ*S*°), are also highly correlated; their correlation coefficient is usually above 0.99.^{17,21} If the uncertainty in ΔΔ*G*° is estimated by error propagation,(39) the covariance cov(ΔΔ*H*°,ΔΔ*S*°) significantly decreases the error

$$\sigma^2(\Delta\Delta G^\circ) = \sigma^2(\Delta\Delta H^\circ) + T^2 \sigma^2(\Delta\Delta S^\circ) - 2T\,\mathrm{cov}(\Delta\Delta H^\circ, \Delta\Delta S^\circ) \tag{8}$$

Equation 8 indicates that the free energy is determined more precisely than the enthalpic or entropic contributions alone. A similar error compensation decreases the error in the melting temperature calculated from Δ*H*° and Δ*S*°.(17) This analysis demonstrates that it is useful to report the tabulated ΔΔ*H*° and ΔΔ*S*° parameters with more digits than their individual errors would warrant. If the parameters are rounded to their error estimates, the calculated free energies and melting temperatures may be less precise.
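The effect of the covariance term can be made concrete with a small calculation. It applies standard error propagation to ΔΔ*G*° = ΔΔ*H*° – *T*ΔΔ*S*°; the function name and the numerical errors below are hypothetical and chosen only to show the size of the compensation.

```python
import math


def sigma_ddg(sigma_h, sigma_s, corr, temp_k=310.15):
    """Propagated error of ddG = ddH - T * ddS.

    sigma_h is in kcal/mol, sigma_s in kcal mol^-1 K^-1, and corr is the
    correlation coefficient between the ddH and ddS errors.
    """
    cov = corr * sigma_h * sigma_s
    variance = sigma_h ** 2 + (temp_k * sigma_s) ** 2 - 2.0 * temp_k * cov
    return math.sqrt(max(variance, 0.0))


# Hypothetical errors: 3 kcal/mol in ddH and 9 cal mol^-1 K^-1 in ddS.
with_correlation = sigma_ddg(3.0, 0.009, corr=0.99)
without_correlation = sigma_ddg(3.0, 0.009, corr=0.0)
print(round(with_correlation, 2), round(without_correlation, 2))
```

With a correlation coefficient of 0.99 the propagated error falls well below either of the individual terms, whereas ignoring the correlation gives an error larger than both.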

| **Table 1** Nearest-Neighbor Parameters for Differences between LNA·DNA and DNA·DNA Base Pairs |

| **Table 3** Thermodynamic Parameters for LNA Single Mismatches in 1 M Na^{+} |

Validation Melting Experiments

Validation sets were measured by ultraviolet spectroscopy as previously described.(30) Absorbance at 268 nm was recorded every 0.1 °C using a Beckman DU 650 spectrophotometer. The temperature was changed at a rate of 25 °C/h in the range from 10 to 98 °C using a high-performance temperature controller (Beckman-Coulter, Brea, CA). Both heating and cooling melting profiles were collected. Sloping baselines were subtracted from the melting profiles,(30) and the melting temperature was defined as the temperature at which the fraction of melted duplexes equaled 0.5.

Nearest-Neighbor Parameters for Single-Base Mismatches

There are 12 possible LNA·DNA mismatches (+A·A, +C·C, +G·G, +T·T, +A·C, +C·A, +A·G, +G·A, +C·T, +T·C, +G·T, and +T·G). In our design, mismatches
were located in the center of LNA triplets. Enthalpic, entropic, and
free energy effects were determined from the differences between the
energetics of LNA mismatch duplexes and core DNA duplexes.

As an example, let us consider the Set1–17 duplex containing the +G·G mismatch. The enthalpic contribution from the A+T+G+TC/TAGAG duplex subsequence is calculated from the difference in the total enthalpy of the Set1–17 (TXRD-CGTCA+T+G+TCGC) and Set1–10 (TXRD-CGTCATGTCGC) duplexes (Table S1 of the Supporting Information)

$$\Delta\Delta H^\circ(\text{A+T+G+TC/TAGAG}) = \Delta H^\circ(\text{Set1–17}) - \Delta H^\circ(\text{Set1–10}) \tag{9}$$

The nearest-neighbor model assumes that this contribution is the sum of four nearest-neighbor doublets. We used the parameters of McTigue et al.(21) for two doublets (A+T/TA and +TC/AG). An equation similar to eq 4 was constructed. The left side contains two unknown parameters

$$\Delta\Delta H^\circ(\text{+T+G/AG}) + \Delta\Delta H^\circ(\text{+G+T/GA}) = \Delta\Delta H^\circ(\text{A+T+G+TC/TAGAG}) - \Delta\Delta H^\circ(\text{A+T/TA}) - \Delta\Delta H^\circ(\text{+TC/AG}) \tag{10}$$

Equation 10 was built for each mismatched duplex. Sequences having the +G·G mismatch were grouped into a subset. The resulting system of linear equations was overdetermined and was solved by SVD analysis. Eight unknown nearest-neighbor parameters (+A+X/TY, +C+X/GY, +G+X/CY, +T+X/AY, +X+A/YT, +X+C/YG, +X+G/YC, and +X+T/YA) were obtained (+X = +G, and Y = G). This procedure was repeated for 12 +X·Y mismatch types, and 96 parameters (8 × 12) were determined. The thermodynamic values of LNA duplexes containing mismatches can be predicted from eqs 1 and 2 using the new parameters.

SVD analysis indicated that the rank of matrix **M**, i.e., the number of linearly independent equations, was seven for mismatches. Eight fitted nearest-neighbor parameters are useful, but they are not a unique solution^{34,35,40} because a constraint equation relates the numbers of eight doublets (*N*)

$$N(\text{+A+X/TY}) + N(\text{+C+X/GY}) + N(\text{+G+X/CY}) + N(\text{+T+X/AY}) = N(\text{+X+A/YT}) + N(\text{+X+C/YG}) + N(\text{+X+G/YC}) + N(\text{+X+T/YA}) \tag{11}$$

The constraint decreases the number of unique parameters to seven for each mismatch type. Equation 11 is valid for duplexes containing mismatches within consecutive LNAs.
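The rank deficiency can be reproduced numerically. In this sketch, which uses a hypothetical count matrix rather than the measured duplexes, every mismatch site contributes exactly one doublet on its 5′ side (+K+X/LY) and one on its 3′ side (+X+M/YN), and all 16 combinations of the flanking bases K and M are included.

```python
from itertools import product

import numpy as np

BASES = "ACGT"

# Columns 0-3 count the 5' doublets (+A+X/TY, +C+X/GY, +G+X/CY, +T+X/AY);
# columns 4-7 count the 3' doublets (+X+A/YT, +X+C/YG, +X+G/YC, +X+T/YA).
rows = []
for k, m in product(range(len(BASES)), repeat=2):
    row = np.zeros(8)
    row[k] = 1      # 5' neighbor of the mismatch
    row[4 + m] = 1  # 3' neighbor of the mismatch
    rows.append(row)
M_mismatch = np.array(rows)

rank = np.linalg.matrix_rank(M_mismatch)

# Constraint: in every row the 5' counts and the 3' counts sum to the same
# number, so the eight columns are linearly dependent.
constraint = M_mismatch[:, :4].sum(axis=1) - M_mismatch[:, 4:].sum(axis=1)
print(rank, bool(np.all(constraint == 0)))  # 7 True
```

Even with every flanking combination present, only seven independent parameters can be determined, in agreement with the rank reported above.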

Unique, linearly independent parameters can be constructed from linear combinations of eight nonunique parameters. A similar constraint limits the number of unique parameters for some DNA mismatches. Allawi and SantaLucia proposed seven linearly independent sequences for DNA mismatches.(41) They added a C·G base pair to nonunique doublets to create linearly independent triplets. Using a similar procedure, we have added the +C·G base pair to LNA doublets and created seven unique triplets (+A+X+C/TYG, +C+X+C/GYG, +G+X+A/CYT, +G+X+C/CYG, +G+X+G/CYC, +G+X+T/CYA, and +T+X+C/AYG). A single LNA mismatch lies in the center. Using SVD analysis, seven parameters for those triplets were determined for each +X·Y mismatch type (Table S3 of the Supporting Information).

The energetics of any LNA +X·Y mismatch sequence (+K+X+M/LYN) could also be calculated from unique triplet parameters

$$\Delta\Delta H^\circ(\text{+K+X+M/LYN}) = \sum_{i=1}^{7} N_i(\text{triplet})\,\Delta\Delta H^\circ_i(\text{triplet}) \tag{12}$$

The +K·L and +M·N pairs are Watson–Crick base pairs adjacent to the mismatch. The numbers of seven triplets, *N*(triplet), are related to the numbers of doublets: *N*(+A+X+C/TYG) = *N*(+A+X/TY), *N*(+C+X+C/GYG) = *N*(+C+X/GY), *N*(+G+X+A/CYT) = *N*(+X+A/YT), *N*(+G+X+C/CYG) = *N*(+G+X/CY) – *N*(+X+A/YT) – *N*(+X+G/YC) – *N*(+X+T/YA), *N*(+G+X+G/CYC) = *N*(+X+G/YC), *N*(+G+X+T/CYA) = *N*(+X+T/YA), and *N*(+T+X+C/AYG) = *N*(+T+X/AY). The doublet and triplet parameter sets predict identical thermodynamic values for any LNA mismatch sequence. Both parameter sets are implementations of the same nearest-neighbor model and do not take into account any next-nearest-neighbor or longer interactions.
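The equivalence of the two parameter sets can be checked with a short sketch. The doublet values below are random placeholders, and each triplet value is taken as the sum of its two constituent doublet values, which is an assumption made only for this illustration (in the paper the triplet parameters were fitted directly). The triplet counts follow the seven relations listed above.

```python
import random
from itertools import product

BASES = "ACGT"
random.seed(1)

# Hypothetical doublet values for one mismatch type (arbitrary units):
# five[K] stands for +K+X/LY and three[M] for +X+M/YN.
five = {k: random.uniform(-5.0, 5.0) for k in BASES}
three = {m: random.uniform(-5.0, 5.0) for m in BASES}

# The seven unique triplets, keyed by (5' neighbor, 3' neighbor).
TRIPLETS = [("A", "C"), ("C", "C"), ("G", "A"), ("G", "C"),
            ("G", "G"), ("G", "T"), ("T", "C")]
triplet_value = {(a, b): five[a] + three[b] for a, b in TRIPLETS}


def triplet_counts(k, m):
    """N(triplet) for one site +K+X+M/LYN, from the doublet counts."""
    n5 = {b: int(b == k) for b in BASES}  # counts of 5' doublets
    n3 = {b: int(b == m) for b in BASES}  # counts of 3' doublets
    return {
        ("A", "C"): n5["A"],
        ("C", "C"): n5["C"],
        ("G", "A"): n3["A"],
        ("G", "C"): n5["G"] - n3["A"] - n3["G"] - n3["T"],
        ("G", "G"): n3["G"],
        ("G", "T"): n3["T"],
        ("T", "C"): n5["T"],
    }


def doublet_prediction(k, m):
    return five[k] + three[m]


def triplet_prediction(k, m):
    counts = triplet_counts(k, m)
    return sum(counts[t] * triplet_value[t] for t in TRIPLETS)


max_diff = max(
    abs(doublet_prediction(k, m) - triplet_prediction(k, m))
    for k, m in product(BASES, repeat=2)
)
print(max_diff < 1e-12)
```

Note that *N*(+G+X+C/CYG) can be negative (for example, it equals −1 for +A+X+T/TYA), which is what allows seven triplets to reproduce all 16 flanking combinations.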