Long RNA aptamers (60–100 nucleotides) are routinely synthesized by solid-phase phosphoramidite chemistry, an automated process used for small-scale synthesis of oligonucleotides (Reese, 2005). However, therapeutic uses require large-scale, high-quality, cGMP-grade synthesis, and long RNA aptamers remain difficult to synthesize under these conditions (Reese, 2005). One solution to this problem has been extensive truncation of RNA aptamer sequences down to minimal functional sequences (<60 nt) after selection (Biesecker et al., 1999; Rusconi et al., 2002; Dassie et al., 2009). As discussed above, this process is often time-consuming and arduous, and it does not work for all aptamers. Another potential solution is to identify shorter RNA aptamer sequences through the use of short RNA SELEX libraries.
In this study, we characterized the nucleotide composition of RNA aptamers derived from several independent selections carried out with a 51-nt-long RNA SELEX library (Sel2N20). The RNA aptamers isolated from this library are, on average, 20–30 nucleotides shorter than RNA aptamers generated from conventional SELEX libraries. High-throughput 454 sequencing combined with bioinformatics analysis was used to determine the nucleotide composition of the sequence pools from the various aptamer selections. This analysis revealed a bias toward pyrimidine-rich sequences, resulting from a loss of adenines in the RNA pools relative to the starting library. This loss was already observed after 2 selection rounds and was independent of the partitioning step (against target), since it also occurred when the partitioning step was omitted (nontargeted selection). The bias was only partially due to the 3:1 ratio of 2′-F pyrimidines to 2′-OH purines used in the transcription step, as selections with a 1:1 or 0.7:1 ratio of these nucleotides also exhibited a bias. The selected pools also exhibited greater predicted structural stability (lower minimum free energy), likely a direct consequence of adenine depletion.
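The comparison of pool composition against the starting library can be illustrated with a minimal Python sketch. It is not the bioinformatics pipeline used in this study: the function names (`base_frequencies`, `pyrimidine_fraction`, `composition_shift`) are hypothetical, and the toy sequences are made up for illustration only and are not sequencing data.

```python
from collections import Counter

BASES = "ACGU"


def base_frequencies(sequences):
    """Return the pooled fraction of each base (A, C, G, U) across all sequences."""
    counts = Counter()
    for seq in sequences:
        # Sequencing reads are often reported as DNA, so count T as U.
        counts.update(seq.upper().replace("T", "U"))
    total = sum(counts[b] for b in BASES)
    if total == 0:
        raise ValueError("no A/C/G/U bases found")
    return {b: counts[b] / total for b in BASES}


def pyrimidine_fraction(freqs):
    """Fraction of pyrimidines (C + U) given a base-frequency dictionary."""
    return freqs["C"] + freqs["U"]


def composition_shift(library, selected):
    """Per-base change in frequency (selected pool minus starting library)."""
    lib = base_frequencies(library)
    sel = base_frequencies(selected)
    return {b: sel[b] - lib[b] for b in BASES}


# Made-up toy pools (assumption: illustrative only, not data from this study).
library_pool = ["ACGUACGUACGUACGUACGU", "AAGGCCUUAAGGCCUUAAGG"]
selected_pool = ["CCUUGCUUCGCUUGCCUUCG", "UUCGCUACCUUGCGUUCCUG"]

shift = composition_shift(library_pool, selected_pool)
for base in BASES:
    print(f"{base}: {shift[base]:+.3f}")
```

A negative value for A together with a higher pyrimidine fraction in the selected pool would correspond to the kind of adenine loss described above. Applied per selection round, the same calculation shows how early the shift appears.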
In accordance with our observations, several groups have postulated that functional RNAs (both artificially selected and naturally occurring RNAs) have a more stable secondary structure than random RNA sequences (Le et al., 1989; Chen et al., 1990; Clote et al., 2005). The reason for this is thought to be that functional RNAs depend on a defined secondary structure for function. Indeed, current in silico prediction algorithms use structural stability as evidence of a functional RNA (Bejerano et al., 2004a; Bonnet et al., 2004; Washietl et al., 2005). Like naturally occurring functional RNAs (eg, tRNAs, rRNAs, hammerhead ribozymes, and miRNAs), RNA sequences from a SELEX experiment can be considered functional RNAs and thus are predicted to exhibit higher structural stability. Hermann and colleagues have further suggested that, in general, artificially selected RNA aptamers have significantly higher predicted structural stability than naturally occurring nucleic acids (functional RNAs) (Hermann and Patel, 2000). The reason for this difference was thought to be that selection of natural RNAs depends on multiple factors, such as biological function as well as ligand binding, whereas artificial RNA aptamers depend solely on ligand binding. RNAs with greater structural stability are postulated to be better at interlocking with their cognate target ligands. Therefore, because SELEX experiments are usually designed to isolate RNA aptamers with high affinity for their target, the selection pressure is thought to favor the isolation of RNAs with greater structural stability.
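Comparing a sequence's stability with that of shuffled sequences of the same composition can be sketched in Python as follows. This is only a rough illustration and not the method of the studies cited above: instead of a thermodynamic minimum free energy, it uses the maximum number of nested base pairs (a Nussinov-style dynamic program) as a crude stand-in for stability, and it shuffles single nucleotides. The function names and parameters (`max_base_pairs`, `pairing_z_score`, `min_loop`, `n_shuffles`) are hypothetical.

```python
import random
import statistics

# Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs, with at least min_loop unpaired
    bases inside every hairpin (a proxy for stability, NOT a free energy)."""
    n = len(seq)
    if n == 0:
        return 0
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case: position j is unpaired
            for k in range(i, j - min_loop):  # case: position k pairs with j
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


def pairing_z_score(seq, n_shuffles=100, seed=0):
    """Z-score of the pair count of seq against mononucleotide shuffles.
    Positive values mean more pairing than composition-matched random sequences."""
    rng = random.Random(seed)
    observed = max_base_pairs(seq)
    bases = list(seq)
    shuffled_scores = []
    for _ in range(n_shuffles):
        rng.shuffle(bases)
        shuffled_scores.append(max_base_pairs("".join(bases)))
    spread = statistics.pstdev(shuffled_scores)
    if spread == 0:
        return 0.0
    return (observed - statistics.mean(shuffled_scores)) / spread


# Made-up hairpin-like toy sequence (assumption: illustrative only).
print(max_base_pairs("GGGAAACCC"))
print(pairing_z_score("GGGGGGAAAACCCCCC", n_shuffles=50))
```

The sign convention is reversed relative to free-energy z-scores, where more negative values indicate greater stability. For real analyses, a thermodynamic folding program would replace `max_base_pairs`.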
While binding to a target ligand may play a role in selecting for RNAs with overall higher structural stability, it is unlikely to be the only contributing factor. Indeed, we observed that selected sequences from the nontargeted selections also display more structural stability than the random Sel2N20 RNA sequence library. Thus, it is likely that factors intrinsic to the selection process (eg, mutant T7 RNA polymerase, RT-PCR, nature of the SELEX library) may also influence the structural composition of the selected RNAs. We have investigated the transcription efficiencies of synthetic oligonucleotide templates with either low or high adenosine (A) content. Interestingly, our data suggest that the template's adenosine content does not influence the transcription efficiency of either the mutant T7 RNA polymerase or the RT enzyme and, therefore, does not dictate the observed nucleotide bias. Alternative explanations for the nucleotide bias that we have not ruled out include biases in nucleotide misincorporation by these enzymes. For example, the rate of base mispair insertion of various RTs (eg, HIV RT, AMV RT, and M-MLV RT) may favor the insertion of one base over another, resulting in a nucleotide bias. Indeed, the fidelity rates of RTs have been implicated as a possible reason behind the genetic variability and convergence observed with retroviral genomes (Roberts et al., 1988; Yu and Goodman, 1992).
Another possible explanation for the observed nucleotide bias is the nature of the RNA library (eg, influence of the fixed regions or an unforeseen consequence of a short variable region). To this end, Zimmermann et al. (2010) observed a shift toward less stable predicted secondary structures when performing both targeted and nontargeted SELEX with an RNA library derived from the E. coli genome (genome-SELEX). Unlike the Sel2N20 library, the complexity of the genome-SELEX library is constrained by the E. coli genome and the length of the library is highly variable (>60 nt). The authors of the study also reasoned that factors intrinsic to the selection process (amplification steps or nature of SELEX library) may favor the propagation of certain RNA sequences over others.
While the SELEX process with the Sel2N20 RNA library yields RNAs with overall increased predicted structural stability and greater ease of chemical synthesis, a potential concern is that a nonspecific bias is influencing the selections, potentially reducing the likelihood that the optimal sequences are ultimately identified. In general, the nonspecific propagation of sequences across rounds reduces the efficiency of SELEX. However, we have identified several useful aptamers with the Sel2N20 library using similar protocols (our unpublished data), suggesting that the bias is not so severe that aptamer selections necessarily fail. Indeed, a similar bias is likely to be present in successful selections described by others but to have gone unnoticed because of the limited number of sequences that were determined. We expect that identifying such nonspecific influences will provide the understanding needed to optimize the SELEX process into a more robust methodology.