J Biomol Tech. 2008 December; 19(5): 335–341.
PMCID: PMC2628072

Effect of Primer Proximity to a Difficult-to-Sequence Region on Read Length and Sequence Quality

Abstract

Anecdotal evidence suggests that the proximity of a primer to a difficult region may affect read length and sequence quality. In this paper we sequenced several categories of difficult regions using primers located at various distances from such regions, and we found only a weak correlation, if any, between primer proximity and read length or sequence quality. The occasional improvements observed in some studies could instead be related to better primers or better-quality DNA. We suggest that, instead of designing primers at varying distances from a difficult region, sequence finishers concentrate on applying modified chemistries appropriate to a given difficult region.

Keywords: difficult template region, DNA sequencing, modified sequencing chemistry, primer proximity

INTRODUCTION

Despite tremendous advances using next generation sequencing technologies,1–3 the elucidation of DNA sequence by the Sanger protocol4 is still the method of choice in most DNA core facilities as well as in small and large sequencing centers. Advances in sequencing chemistries,5–8 optimization of auxiliary protocols,9–12 and improvements in instrumentation have made this technology flexible, reliable, and easy to use. Assuming that the quality and the quantity of the DNA preparation are acceptable, one can easily obtain over 900 bases of good quality for most nondifficult templates. However, if the DNA template is difficult (defined as one that cannot be sequenced using the standard ABI-like protocol5), more advanced protocols are needed to obtain clean read-through. In the last few years, significant progress has been made in sequencing through many kinds of difficult templates.13–22 However, one almost unexplored aspect of sequencing through any difficult region is the effect of primer proximity to that region on the read length. Currently, to the best of our knowledge, only anecdotal evidence exists (e.g., Ref. 23, personal communications) that primer proximity to a difficult region could affect read length and sequence quality. In this paper, we systematically explore the effect of primer proximity to a number of difficult regions on the ability to obtain clean, long reads through such regions.

MATERIALS AND METHODS

Fourteen DNA templates used throughout this study contained a variety of difficult-to-sequence regions and were primarily collected through standard submission of sequencing requests to a DNA sequencing group at Wyeth, Cambridge. All of these templates were prepared using Marligen’s PowerPrep HP Plasmid Maxiprep System (Ijamsville, MD) and some DNAs were also prepared using Sequence Resolver Kit.11

DNA sequencing (in triplicate for each primer), cleanup of sequencing reactions, and electrophoresis were carried out as described before.14,15 Modifications to a standard DNA sequencing protocol are described in the legend to Figure 6. All dye terminator mixes were purchased from Applied Biosystems (Foster City, CA) and betaine was from Sigma (Sigma-Aldrich, St. Louis, MO). Data were analyzed using the Sequencher program (Gene Codes, Ann Arbor, MI), and for assembly into contigs, only the trace with the median read length of each triplicate was used.
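The selection of one representative trace per triplicate can be sketched as follows. This is a minimal illustration, not the procedure actually used in the study: the function name `pick_median_trace`, the trace names, and the read-length values are hypothetical, and it assumes that each trace has already been reduced to a single read-length number.

```python
def pick_median_trace(traces):
    """Return the (name, read_length) pair with the median read length.

    `traces` is a list of (name, read_length) tuples, e.g. one triplicate.
    For an odd number of traces this is the true median; for an even
    number it returns the upper of the two middle traces (an assumption).
    """
    ranked = sorted(traces, key=lambda trace: trace[1])
    return ranked[len(ranked) // 2]


# Hypothetical triplicate for one primer (values are illustrative only).
triplicate = [("primer1_rep1", 700), ("primer1_rep2", 650), ("primer1_rep3", 720)]
chosen = pick_median_trace(triplicate)
print(chosen)
```

Using the median rather than the longest trace avoids basing the assembly on an unusually good or unusually poor replicate.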

FIGURE 6
Effect of various modifications of sequencing protocol and different methods of template preparation on read length. A and B show Q ≥ 20 read length in forward and reverse direction, respectively. Note that in A no data were obtained for DNA 2 ...

Primer selection for this study was greatly facilitated by the Find Primer algorithm, which is part of the DNA sequencing LIMS developed at the Wyeth core facility.24–26 Briefly, this algorithm matches all primers available in our library against a reference sequence and displays the positions and orientations of the primers found. If needed, new primers can be designed at specified intervals by using another algorithm developed at the Wyeth core facility. An example of such a primer match is shown in Figure 1. In each case presented in this paper, several primers (from 3 to 28) were selected on both sides of a difficult region (Table 1). To predict various potentially difficult-to-sequence regions in templates, we developed the “Examine Repeats” algorithm,26 which can calculate up to seven different structures. In addition, the GC module calculates GC content in a reference sequence at specified intervals (Fig. 1). Examples of such predictions are shown in Figures 2 and 3.
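The two ideas described above, perfect matching of library primers against a reference sequence and GC content computed at fixed intervals, can be sketched in a few lines. This is an illustrative approximation only and not the Wyeth LIMS code: the function names, the 0-based coordinate convention, the non-overlapping window scheme, and the example sequences are all assumptions made here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_primers(reference, primers):
    """List perfect primer matches as (name, orientation, start, end).

    Coordinates are 0-based on the given strand, end exclusive.
    "forward" means the primer sequence itself occurs in the reference;
    "reverse" means its reverse complement does.
    """
    reference = reference.upper()
    hits = []
    for name, seq in primers.items():
        seq = seq.upper()
        patterns = (("forward", seq), ("reverse", reverse_complement(seq)))
        for orientation, pattern in patterns:
            start = reference.find(pattern)
            while start != -1:
                hits.append((name, orientation, start, start + len(pattern)))
                start = reference.find(pattern, start + 1)
    return sorted(hits, key=lambda hit: hit[2])


def gc_content_by_window(reference, window):
    """Return (window_start, GC percent) for consecutive windows."""
    reference = reference.upper()
    result = []
    for start in range(0, len(reference), window):
        chunk = reference[start:start + window]
        gc = sum(base in "GC" for base in chunk)
        result.append((start, round(100.0 * gc / len(chunk), 1)))
    return result


# Hypothetical reference and primer library (illustrative sequences only).
reference = "ATGCCGTAAGGCTTTCGGAT"
library = {"P1": "GCCGTA", "P2": "TCCGAA"}
hits = find_primers(reference, library)
gc_profile = gc_content_by_window(reference, 10)
print(hits)
print(gc_profile)
```

Given the start of a difficult region, the distance of a forward primer follows from the `end` coordinate of its hit, and that of a reverse primer from the `start` coordinate, since those positions correspond to the primers' 3′ ends.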

FIGURE 1
Matching primers to a reference sequence using “Find Primer” module. The database finds perfect matches of all existing primers against provided reference sequence. The matching can occur over the entire length or any portion of a reference ...
FIGURE 2
Various Potentially Difficult-to-Sequence Motifs in DNA 2a
FIGURE 3
Various Potentially Difficult-to-Sequence Motifs in DNA 6a
TABLE 1
DNA Templates: Characteristics of Difficult Regionsa

Note: All primers used in this study passed primer design criteria as specified by Primer Designer software from Scientific and Educational Software (Cary, NC); Tm 54–70°C, GC% = 55±10, stability > 1.3 kcal/mole (3′ vs 5′), matches at 3′ end < 3, hairpin separation < 7, base runs < 4, adjacent homologous bases < 7, and repeats:dinucleotide pairs < 3.
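Some of the criteria in this note are simple enough to check directly. The sketch below covers only three of them (Tm range, GC%, and base runs) and is not a reimplementation of the Primer Designer software. The function names are hypothetical, the Tm value is supplied by the caller because the melting-temperature formula used by that software is not stated here, and treating the Tm and GC limits as inclusive is an assumption.

```python
from itertools import groupby


def gc_percent(primer):
    """Percentage of G and C bases in the primer."""
    primer = primer.upper()
    return 100.0 * sum(base in "GC" for base in primer) / len(primer)


def longest_base_run(primer):
    """Length of the longest run of one repeated base, e.g. 4 for AAAA."""
    return max(len(list(group)) for _, group in groupby(primer.upper()))


def passes_basic_criteria(primer, tm_celsius):
    """Check Tm 54-70 C, GC% = 55 +/- 10, and base runs < 4.

    The remaining criteria from the note (3' stability, 3' end matches,
    hairpins, homologous bases, dinucleotide repeats) are not checked.
    """
    return (
        54 <= tm_celsius <= 70
        and 45 <= gc_percent(primer) <= 65
        and longest_base_run(primer) < 4
    )


# Hypothetical primers and Tm values (illustrative only).
ok = passes_basic_criteria("GCCATGGCTAGCTAGGCATC", tm_celsius=60)
bad = passes_basic_criteria("AAAAGCGCGCGCGCGCGCGC", tm_celsius=60)
print(ok, bad)
```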

RESULTS AND DISCUSSION

The characteristic of difficult regions in each of the 14 templates used in this study, as well as the number of primers used in forward and reverse directions, is presented in Table 1. The forward/reverse range indicates the distance (in bases) from the 3′ end of sequencing primers to the beginning of a difficult region.

There are two general cases observed in this study. Case 1: The forward and reverse reads stop at the beginning of a difficult motif (DNAs 4–6) or at some distance into such a region (DNAs 2, 13) without completely getting through, with the consequence that there is no assembly into a single contig. It is obvious that in this case read length is dependent on the distance of a primer from a difficult region, but in no situation was it possible to sequence through such a region. Case 2: The forward and reverse primers read through a difficult region and assemble into a single contig (all other DNAs in this study). Reads are somewhat shorter (with few exceptions) compared with typical read lengths of over 900 bases, and relatively small standard deviations (1–15% with median of about 4.5%) for reads in either forward or reverse directions indicate the lack of a significant effect of primer position on the ability to obtain better quality and longer reads. Figure 4 shows an example of Sequencher assembly for DNA template 5 containing a strong 24-base hairpin. All 11 sequences, regardless of the distance to the hairpin, terminated at the beginning of the hairpin and did not overlap with sequences generated using reverse primers (not shown here). In Figure 5 (DNA 8) the forward and reverse reads assemble into a single contig but there is no significant effect of primer proximity to CA/GT dinucleotide repeats on the read length. Table 2 shows individual Q ≥ 20 read-length values corresponding to data presented in Figure 5.
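The standard deviations quoted above are expressed as a percentage of the read length. A minimal sketch of that calculation follows. It assumes the sample standard deviation divided by the mean, which may not be exactly how the values in this paper were derived, and the read lengths used are hypothetical, not data from this study.

```python
from statistics import mean, stdev


def relative_sd_percent(read_lengths):
    """Sample standard deviation as a percentage of the mean read length."""
    return 100.0 * stdev(read_lengths) / mean(read_lengths)


# Hypothetical Q >= 20 read lengths for primers at different distances
# from one difficult region (illustrative values only).
forward_reads = [800, 820, 780]
spread = relative_sd_percent(forward_reads)
print(f"{spread:.1f}%")
```

A small value means that reads from primers at different distances are of similar length, which is the pattern described for case 2.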

FIGURE 4
Termination of sequencing traces at the beginning of a hairpin in DNA 5. a: This vector was purchased from Invitrogen and routinely any new vector is re-sequenced. Sequencing verification revealed the presence of a 27-base-pair insertion that wasn’t ...
FIGURE 5
Assembly of sequencing traces from forward and reverse directions in DNA 8. The top part of the figure shows an overview of trace alignment. Above each line is the trace description with last part indicating the position of a trace with respect to the ...
TABLE 2
Individual Q>20 Read Length Values Corresponding to Chromatograms Shown in Figure 5a

In all cases presented in this work (107 forward and 150 reverse primers tested on 14 different difficult templates), we did not observe any significant effect of primer proximity to a difficult region, either on the ability to read through that region (case 1 DNAs) or on read length and sequence quality (case 2 DNAs). A much better option for sequencing through any kind of difficult template is to use modified chemistry,14,15 as shown in Figure 6A,B, or a template prepared with a different method.11,27 The data in this figure show significant variation in read length, for the same primer, depending on the type of chemistry used. It is also evident that the optimal chemistry depends on the direction of sequencing. This phenomenon is explored more deeply in an upcoming paper based on the interlaboratory study conducted by the DNA Sequencing Research Group on a much larger set of difficult templates (J. Kieleczawa et al., accepted for publication in JBT).

ACKNOWLEDGMENTS

I wish to thank Drs. L. Bloom and B. Ulmer for critical reading and numerous suggestions during the preparation of this manuscript.

REFERENCES

1. Margulies M, Egholm M, Altman WE, et al. Genome-sequencing in micro-fabricated high-density picolitre reactors. Nature. 2005;437:376–380.
2. Bentley DR. Whole genome re-sequencing. Curr Opin Genet Dev. 2006;16:545–552.
3. McLaughlin SF, Peckham HE, Zhang ZH, et al. Whole-genome resequencing with short reads: Accurate mutation discovery with mate pairs and quality values. 2007 AGBT Conference; Marco Island, FL. Poster 2620.
4. Sanger F, Nicklen S, Coulson AR. DNA sequencing with chain-terminating inhibitors. Proc Natl Acad Sci USA. 1977;74:5463–5467.
5. ABI PRISM® BigDye Terminator v3.1 Cycle Sequencing Kit. 2002 Protocol. Part number 4337035 Rev. A. Applied Biosystems, Foster City, CA.
6. Automated DNA Sequencing. Chemistry Guide. Document number 4305080B. 2000. Applied Biosystems, Foster City, CA.
7. Azadan RJ, Fogleman JC, Danielson PB. Capillary electrophoresis sequencing: Maximum read length at minimal cost. BioTechniques. 2002;32:24–28.
8. Brandis J, Bloom C, Richards JH. DNA polymerases having improved labeled nucleotide incorporation properties. 2001. US Patent 6,265,193.
9. Kieleczawa J, editor. DNA Sequencing: Optimizing the Process and Analysis. Sudbury, MA: Jones & Bartlett; 2005.
10. Kieleczawa J, editor. DNA Sequencing II: Optimizing Preparation and Cleanup. Sudbury, MA: Jones & Bartlett; 2006.
11. GE Healthcare. Sequence Finishing Kit. Product Code 25-6401-01, 2003.
12. Murray V. Improved double-stranded DNA sequencing using the linear polymerase chain reaction. Nucleic Acids Res. 1989;17:8889.
13. Adams PS, Dolejsi MK, Hardin S, et al. DNA sequencing of a moderately difficult template: Evaluation of the results from a Thermus thermophilus unknown test sample. BioTechniques. 1996;21:678.
14. Kieleczawa J. Simple modifications of the standard DNA sequencing protocol allow for sequencing through siRNA hairpins and other repeats. J Biomol Tech. 2005;16:220–223.
15. Kieleczawa J. Fundamentals of sequencing of difficult templates-an overview. J Biomol Tech. 2006;17:207–217.
16. Gerstner A, Sasvari-Szekely M, Kalasz H, Guttman A. Sequencing difficult DNA templates using membrane-mediated loading with hot sample application. BioTechniques. 2000;28:628–630.
17. Hawes JW, et al. Sequencing through difficult repetitive sequence. Results from the ABRF DNA Sequence Research Group Study. J Biomol Tech. 2003. www.abrf.org/Research-Groups/DNASequencing/DSRG2003Study.
18. Ducat DC, Herrera FJ, Triezenberg SJ. Overcoming obstacles in DNA sequencing of expression plasmids for short interfering RNAs. BioTechniques. 2003;34:1140–1144.
19. Esposito D, Gillette W, Hartley JL. Blocking oligonucleotides improve sequencing through inverted repeats. BioTechniques. 2003;35:914–920.
20. Langan JE, Rowbottom L, Liloglou T, Field JK, Risk JM. Sequencing of difficult templates containing poly (A/T) tracts: Closure of sequencing gaps. BioTechniques. 2002;33:276–280.
21. Thomas MG, Hesse SA, McKie AT, Farzaneh F. Sequencing of cDNA using anchored oligo dT primers. Nucleic Acids Res. 1993;21:3915–3916.
22. Zhao X, Haqqi T, Yadav SP. Sequencing telomeric DNA templates with short tandem repeats using dye terminator cycle sequencing. J Biomol Tech. 2000;11:111–121.
23. Yang A. Solutions for sequencing difficult regions. In: Kieleczawa J, editor. DNA Sequencing III: Dealing with Difficult Templates. Sudbury, MA: Jones & Bartlett; 2008. pp. 65–90.
24. Koffman D, Sookdeo H. DNA sequencing database: A flexible LIMS for DNA sequencing analysis. In: Kieleczawa J, editor. DNA Sequencing: Optimizing the Process and Analysis. Sudbury, MA: Jones & Bartlett; 2005. pp. 143–156.
25. Kieleczawa J, Atnoor D, Carmical M, et al. Essential software and other tools used in modern biology laboratories. In: Kieleczawa J, editor. DNA Sequencing II: Optimizing Preparation and Cleanup. Sudbury, MA: Jones & Bartlett; 2006. pp. 313–353.
26. Kieleczawa J, Lakshmanan B, Koffman D, Kitzmiller A. Bio-informatics tools to aid sequencing of difficult templates. In: Kieleczawa J, editor. DNA Sequencing III: Dealing with Difficult Templates. Sudbury, MA: Jones & Bartlett; 2008. pp. 163–177.
27. Kieleczawa J, Wu P. Preparation of difficult DNA templates using seven different commercial methods. In: Kieleczawa J, editor. DNA Sequencing II: Optimizing Preparation and Cleanup. Sudbury, MA: Jones & Bartlett; 2006. pp. 1–14.

Articles from Journal of Biomolecular Techniques : JBT are provided here courtesy of The Association of Biomolecular Resource Facilities