PLoS One. 2010; 5(5): e10577.
Published online 2010 May 11. doi:  10.1371/journal.pone.0010577
PMCID: PMC2868030

Restriction Site Extension PCR: A Novel Method for High-Throughput Characterization of Tagged DNA Fragments and Genome Walking

Alfredo Herrera-Estrella, Editor



Background

Insertion mutant isolation and characterization are extremely valuable for linking genes to physiological function. Once an insertion mutant phenotype is identified, the challenge is to isolate the responsible gene. Multiple strategies have been employed to isolate unknown genomic DNA that flanks mutagenic insertions; however, all of these methods suffer from limitations due to inefficient ligation steps, inclusion of restriction sites within the target DNA, and non-specific product generation. These limitations become nearly insurmountable when the goal is to identify insertion sites in a high-throughput manner.

Methodology/Principal Findings

We designed a novel strategy called Restriction Site Extension PCR (RSE-PCR) to efficiently conduct large-scale isolation of unknown genomic DNA fragments linked to DNA insertions. The strategy is a modified adaptor-mediated PCR without ligation. An adaptor primer, with complementarity to the 3′ overhang of DNA fragments restricted with an endonuclease (KpnI, NsiI, PstI, or SacI), allows the 3′ end of the DNA fragments to be extended in the first cycle of the primary RSE-PCR. During subsequent PCR cycles and a second semi-nested PCR (secondary RSE-PCR), touchdown and two-step PCR are combined to increase the amplification specificity of target fragments. The efficiency and specificity were demonstrated in our characterization of 37 tex mutants of Arabidopsis. All the steps of RSE-PCR can be executed in a 96-well PCR plate. Finally, RSE-PCR serves as a successful alternative to Genome Walker, as demonstrated by gene isolation from maize, a plant with a more complex genome than Arabidopsis.


Conclusions/Significance

RSE-PCR has high potential application in identifying tagged (T-DNA or transposon) sequence or walking from known DNA toward unknown regions in large-genome plants, with likely application in other organisms as well.


Introduction

Linking gene identity to function is critical for genetic approaches to unravelling complex biological phenomena. For mutant genes disrupted by DNA insertions, the DNA insertion acts as a tag to enable the identification of the mutated gene. Obtaining flanking DNA is also valuable for isolating sequences upstream or downstream from a gene fragment. Unfortunately, especially for high-throughput screens, the current methods of isolating flanking DNA sequence are less than ideal. There are three types of PCR-based techniques for walking into an unknown region from a known genomic fragment. The first type, which includes TAIL-PCR, uses nested specific primers from the ends of the known region and degenerate primers that anneal randomly within the genome to obtain unknown flanking fragments [1], [2]. The second type usually digests the genomic DNA with a restriction enzyme to generate an overhang, followed by ligation of a complementary adaptor. Primers derived from the adaptor and the known sequence amplify the flanking sequences through successive rounds of PCR [3]–[12]. The third type, such as inverse PCR (iPCR), also begins with digestion of genomic DNA with a restriction enzyme; however, subsequent intramolecular ligation generates a small DNA circle. Two primers designed in opposite directions from the known fragment can then amplify the unknown junction region [13]–[18].

TAIL-PCR and iPCR have been used widely for identifying genes from Arabidopsis and rice. TAIL-PCR usually requires three rounds of amplification and special treatment of PCR samples before direct sequencing, and non-specific products are often a problem. iPCR requires sequences long enough for two pairs of nested primers and the presence of two appropriate restriction sites within the amplification range. Adaptor-based PCR usually suffers from non-specific amplification from the adaptor primers, and panhandle suppression may be inadequate, especially when the genome in question is very complex.

We have modified adaptor-based PCR such that non-specific amplification is reduced and ligation is avoided. Specifically, we designed a novel PCR strategy called Restriction Site Extension PCR (RSE-PCR). Genomic DNA targets are specifically and efficiently amplified through two rounds of PCR. During the first cycle of the first round of RSE-PCR (primary RSE-PCR), a short extension of 5 seconds extends the 3′ end of the endonuclease-restricted DNA fragments, which anneal through a 5 bp terminus complementary to the 3′ end of the 1st adaptor primer. At the same time, the 1st adaptor primer cannot complete its own extension in such a short period along most genomic DNA templates, which are kilobases long. During the subsequent cycles and the second round of semi-nested RSE-PCR (secondary RSE-PCR), touchdown and two-step PCR are combined to further enhance the amplification specificity.

The success of this novel strategy is demonstrated by the isolation of T-DNA flanking sequences from 23 out of 37 Arabidopsis mutants of interest, and of an unknown fragment of a particular maize gene. The ease and specificity of RSE-PCR support the efficacy of this approach for high-throughput applications in genetics and for genome walking in diverse organisms, including those with large, complex genomes.

Materials and Methods

An ethics statement is not required for this work.

Genomic DNA isolation and restriction

Genomic DNA was isolated from young leaves of Arabidopsis and maize as described [19]. One microlitre (500 ng to 1 µg) of genomic DNA was digested with 10 units of a restriction endonuclease generating 3′ overhangs in a 100 µL volume containing 1× BSA, 1× buffer and 1 µL of RNase A (10 µg/µL) for 3 hours at the appropriate temperature. The restriction endonucleases were subsequently heat inactivated.

PCR primers and conditions

All primers were synthesized by GenoMechanix (Gainesville, FL) or Invitrogen, and are summarized in Table 1. One microlitre of the restricted genomic DNA described above was added to a 10 µL primary RSE-PCR reaction comprising 0.5 µL of each primer (10 µM; one is JL270, BIL1 or LB1, the other is AdKpnI, AdNsiI, AdPstI, or AdSacI), 1× PCR buffer, 0.5 µL of 50 mM MgCl2, 0.5 µL of dNTP (2.5 mM each), and 0.25 U of Platinum Taq Polymerase (Invitrogen). The primary RSE-PCR program was performed as shown in Table 2. After amplification, 190 µL of autoclaved ddH2O was added to each sample to make a 20-fold dilution, from which 1 µL was removed for the secondary RSE-PCR. The secondary RSE-PCR contained the same ratio of reagents in a 20 µL volume, except that it used a nested specific primer from the known sequence (such as JL202, BIL2 or LB2) and the 2nd general adaptor primer (AP), as described in Table 2.
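The reagent amounts above translate into final concentrations through the usual C1·V1 = C2·V2 relation. The following is a minimal sketch of that arithmetic, not part of the published protocol: the function and variable names are hypothetical, and the MgCl2 figure assumes that only the added 50 mM stock contributes magnesium (the buffer composition is not given in the text).

```python
import math

def final_concentration(stock_conc, stock_vol_ul, reaction_vol_ul):
    """Solve C1*V1 = C2*V2 for the final concentration C2."""
    return stock_conc * stock_vol_ul / reaction_vol_ul

PRIMARY_VOL_UL = 10.0    # primary RSE-PCR reaction volume (from the text)
SECONDARY_VOL_UL = 20.0  # secondary RSE-PCR reaction volume (from the text)
WATER_ADDED_UL = 190.0   # ddH2O added to the primary product (from the text)

# Volumes per 10 µL primary reaction, as stated in the text.
primary_volumes_ul = {
    "specific primer (10 µM)": 0.5,
    "adaptor primer (10 µM)": 0.5,
    "MgCl2 (50 mM)": 0.5,
    "dNTP (2.5 mM each)": 0.5,
}

# Final concentrations in the primary reaction.
primer_final_uM = final_concentration(10.0, 0.5, PRIMARY_VOL_UL)
mgcl2_final_mM = final_concentration(50.0, 0.5, PRIMARY_VOL_UL)  # added stock only (assumption)
dntp_final_mM_each = final_concentration(2.5, 0.5, PRIMARY_VOL_UL)

# Dilution of the primary product before it seeds the secondary reaction.
dilution_factor = (PRIMARY_VOL_UL + WATER_ADDED_UL) / PRIMARY_VOL_UL

# "Same ratio of reagents in a 20 µL volume": scale each volume by 20/10.
scale = SECONDARY_VOL_UL / PRIMARY_VOL_UL
secondary_volumes_ul = {name: vol * scale for name, vol in primary_volumes_ul.items()}

if __name__ == "__main__":
    print(f"primer: {primer_final_uM} µM each")
    print(f"MgCl2:  {mgcl2_final_mM} mM")
    print(f"dNTP:   {dntp_final_mM_each} mM each")
    print(f"primary product dilution: {dilution_factor:.0f}-fold")
    for name, vol in secondary_volumes_ul.items():
        print(f"secondary reaction, {name}: {vol} µL")
```

The buffer and polymerase volumes are left out on purpose, because their stock concentrations are not stated in the text.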

Table 1
The oligonucleotides used in this study.
Table 2
Cycling parameters for RSE-PCR.

Gel analysis and DNA sequencing

Five microlitres of the secondary RSE-PCR products were loaded on a 1.2% agarose gel stained with ethidium bromide in 1× TAE or 1× TBE buffer and visualized under a UV illumination system. The remaining 15 µL of PCR products were purified through a Sephadex G-50 column and subjected to sequencing (Lone Star Labs, Houston, TX).

Results and Discussion

1. Principle of RSE-PCR and optimization of PCR parameters

In 1993, Upcroft and Healey employed PCR priming from SacI-restricted DNA of Giardia duodenalis (an intestinal protozoan parasite, genome size ~12 Mb) to successfully extend the 5′ flanking fragment of a drug resistance related gene [20]. Although their PCR procedure was not described, their idea could be extended and tested in large-scale plant genetics. We designed the 1st adaptor primers to contain a core part of 22 bp (GTAATACGACTCACTATAGGGC, a derivative of the Genome Walker upper adaptor strand (Clontech)) and a 3′ terminus of 5 bp (GTACC for KpnI, TGCAT for NsiI, TGCAG for PstI, and AGCTC for SacI), as shown in Table 1. Theoretically, the recognition site of a six-base-pair endonuclease occurs once in every 4^6 (4096) base pairs, meaning that the average size of the restricted genomic DNA fragments is about 4 Kb. If the sequence around the middle of a fragment is known, the isolation of its flanking 5′ and 3′ parts (about 2 Kb each) will be compatible with the amplification capacity of Platinum Taq Polymerase (Invitrogen). The chance of successful isolation will be further increased through separate digestions with four different endonucleases.
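The relationship between each enzyme and its 5 bp primer terminus can be checked with a short sketch. It relies on two things that the text does not state. The first is the recognition sites and cut positions (GGTAC^C, ATGCA^T, CTGCA^G, GAGCT^C), which are standard values for these enzymes. The second is that the full adaptor primer is the core directly followed by the terminus, which is an assumption because Table 1 is not reproduced here. All function and variable names are hypothetical.

```python
CORE = "GTAATACGACTCACTATAGGGC"  # 22 nt core given in the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]

# Recognition site and number of top-strand bases left 5' of the cut.
# Standard enzyme properties, not taken from the text.
ENZYMES = {
    "KpnI": ("GGTACC", 5),
    "NsiI": ("ATGCAT", 5),
    "PstI": ("CTGCAG", 5),
    "SacI": ("GAGCTC", 5),
}

# 3' termini of the 1st adaptor primers as listed in the text.
SOURCE_TERMINI = {"KpnI": "GTACC", "NsiI": "TGCAT", "PstI": "TGCAG", "SacI": "AGCTC"}

def adaptor_terminus(site, cut_index):
    """The digested fragment ends in site[:cut_index] at its 3' end;
    the primer terminus that anneals to it is the reverse complement."""
    return reverse_complement(site[:cut_index])

def adaptor_primer(enzyme):
    """Core plus enzyme-specific terminus (direct concatenation is assumed)."""
    site, cut_index = ENZYMES[enzyme]
    return CORE + adaptor_terminus(site, cut_index)

# Expected spacing of a six-base site in random sequence with equal base frequencies.
expected_site_spacing_bp = 4 ** 6

if __name__ == "__main__":
    for name, (site, cut_index) in ENZYMES.items():
        terminus = adaptor_terminus(site, cut_index)
        print(name, site, terminus, terminus == SOURCE_TERMINI[name], adaptor_primer(name))
    print("expected site spacing (bp):", expected_site_spacing_bp)
```

For all four enzymes the derived terminus matches the one listed in the text, which is consistent with the fragment's 3′ end annealing to the primer and being extended along the core.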

During the primary RSE-PCR, a 5-second extension in the first cycle extends the 3′ end of the endonuclease-restricted DNA strands, which anneal through a 5 bp terminus complementary to the 3′ end of the 1st adaptor primer, whereas the extension of the 1st adaptor primer along the majority of genomic DNA templates is not completed. Subsequent specific exponential amplification of the target is favored by the combination of touchdown, two-step and semi-nested PCR, and is driven by primers from a known fragment such as the T-DNA border sequence. This gives rise to the 5′ flanking sequence of a known fragment (Figure 1). If nested reverse primers are used instead, the 3′ flanking sequence of a known sequence can be isolated. Five microlitres of the secondary PCR products are checked on a gel as detailed in Materials and Methods, and if the result is positive, the remaining 15 µL of PCR products are purified through Sephadex G-50 and subjected to sequencing.

Figure 1
A general scheme for RSE-PCR.

2. Isolating T-DNA flanking sequence in Arabidopsis transformed with different vectors

To elucidate the molecular mechanisms involved in the complex regulation of the TCH4 (TOUCH4) gene [21], one transgenic line harboring the −258 to +48 region of TCH4 fused to LUC in the Col-0 background was mutagenized with the pSuperTag2 vector to generate T-DNA insertion mutations [22], [23]. Genetic screens identified 37 mutants that showed altered TCH4 expression (tex) after heat shock. Previous attempts with TAIL-PCR worked with only one of the 37 tex mutants (unpublished data, Luis & Braam). Using RSE-PCR, sequences flanking T-DNA insertions were isolated and sequenced from 23 out of 37 tex mutants (Table 3). Figure 2 shows representative RSE-PCR products from one tex mutant digested with the four restriction endonucleases. The RSE-PCR product size ranged from about 300 bp to nearly 3 Kb (data not shown). All the purified RSE-PCR products sequenced with the JL202 primer contained T-DNA left border sequence and genomic sequence from Arabidopsis. Flanking sequences could not be isolated from the remaining 14 tex mutants, possibly due to tandem insertion, lack of intact T-DNA border sequence, DNA rearrangement, or complicated DNA context [24], [25]. One tex mutant contains two insertions. Six tex mutants contain insertions in exons or introns, 4 downstream of protein coding regions and 14 upstream of protein coding regions.
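The counts in this paragraph can be cross-checked with a few lines of arithmetic. This is only a sketch with hypothetical variable names, and it assumes that the three location categories count insertions rather than mutants, which is what makes them add up once the mutant with two insertions is counted twice.

```python
screened_mutants = 37
rse_pcr_recovered = 23   # mutants with flanking sequence isolated by RSE-PCR
tail_pcr_recovered = 1   # mutants resolved in the earlier TAIL-PCR attempts
mutants_with_two_insertions = 1

# Location categories as listed in the text.
insertions_by_location = {
    "exon or intron": 6,
    "downstream of coding region": 4,
    "upstream of coding region": 14,
}

total_insertions = sum(insertions_by_location.values())
unresolved_mutants = screened_mutants - rse_pcr_recovered

rse_pcr_rate_percent = round(100 * rse_pcr_recovered / screened_mutants, 1)
tail_pcr_rate_percent = round(100 * tail_pcr_recovered / screened_mutants, 1)

if __name__ == "__main__":
    print("insertions located:", total_insertions)
    print("mutants not resolved:", unresolved_mutants)
    print("RSE-PCR recovery (%):", rse_pcr_rate_percent)
    print("TAIL-PCR recovery (%):", tail_pcr_rate_percent)
```

Under that assumption, the 24 located insertions correspond to 22 mutants with a single insertion and one mutant with two.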

Table 3
T-DNA insertion sites in 23 tex mutants obtained with RSE-PCR.
Figure 2
Gel image of one representative tex mutant (tex34) after two rounds of RSE-PCR.

In addition, we found that RSE-PCR also works with other vectors commonly used in SALK and SAIL T-DNA insertion lines. The insertion sites of the xth22-A (SAIL_158_A07) and xth24-1 (SALK_005941.51.20.x) mutants were successfully analyzed; the nested primers from the left borders of the vectors used in generating the SAIL (LB1 and LB2) and SALK (BIL1 and BIL2) lines are listed in Table 1.

3. Isolation of multiple insertions in a single line

Multiple bands may be amplified in the secondary RSE-PCR, which suggests the presence of several T-DNA insertions in the line. After gel separation of the bands, 1–200 µL pipette tips were used to pick up a tiny piece of agarose gel directly from individual PCR bands under UV light. Each piece was resuspended in 20 µL of autoclaved water by pipetting up and down several times. One microlitre was then used for another round of PCR with the same primers and cycling program as in the secondary RSE-PCR. The PCR product was checked on a gel, purified as described above, and subjected to sequencing. As an example, tex87 mutants were found to contain two T-DNA insertions: one 2982 bp from the 5′ end of At5g67390, and the other 562 bp downstream of At2g16290 (F-box family protein).

4. Isolating unknown sequence from a particular known gene sequence in different plant species

The above work suggests that as long as the sequence of a DNA fragment is known, the specific flanking sequence can be isolated; therefore, we next tested the feasibility of the approach in a more complex plant genome. The maize ns2 gene from the B73 inbred line was specifically amplified after SacI restriction (AdSacI and ZmPFR13 for the primary RSE-PCR, and AP and ZmPFR14 for the secondary RSE-PCR). Sequencing with primer ZmPFR14 recovered an 863 bp read, which was identical to the sequence obtained previously with the Genome Walker Kit from Clontech [26]. These data suggest that RSE-PCR can substitute for the Genome Walker kit in gene cloning.

Together, the data here indicate that a new strategy, RSE-PCR, has high potential application in identifying tagged (T-DNA or transposon) sequences or walking from known DNA toward unknown regions in large-genome plants, with likely application in other organisms as well.


Competing Interests: The authors have declared that no competing interests exist.

Funding: This work is supported by National Science Foundation grants #0313432 and MCB 0817976 to Janet Braam. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.


1. Liu YG, Mitsukawa N, Oosumi T, Whittier RF. Efficient isolation and mapping of Arabidopsis thaliana T-DNA insert junctions by thermal asymmetric interlaced PCR. Plant J. 1995;8:457–463. [PubMed]
2. Levano-Garcia J, Verjovski-Almeida S, da Silva AC. Mapping transposon insertion sites by touchdown PCR and hybrid degenerate primers. Biotechniques. 2005;38:225–9. [PubMed]
3. Pfeifer GP, Steigerwald SD, Mueller PR, Wold B, Riggs AD. Genomic sequencing and methylation analysis by ligation mediated PCR. Science. 1989;246:810–813. [PubMed]
4. Mueller PR, Wold B. In vivo footprinting of a muscle specific enhancer by ligation-mediated PCR. Science. 1989;246:780–786. [PubMed]
5. Riley J, Butler R, Ogilvie D, Finniear R, Jenner D, et al. A novel, rapid method for the isolation of terminal sequence from yeast artificial chromosome (YAC) clones. Nucleic Acids Res. 1990;19:5395–5490. [PMC free article] [PubMed]
6. Rosenthal A, Jones DSC. Genomic walking and sequencing by oligo-cassette mediated polymerase chain reaction. Nucleic Acids Research. 1990;18:3095–3096. [PMC free article] [PubMed]
7. Arnold C, Hodgson IJ. Vectorette PCR: a novel approach to genomic walking. PCR Methods Appl. 1991;1:39–42. [PubMed]
8. Garrity PA, Wold BJ. Effects of different DNA polymerase in ligation-mediated PCR: enhanced genomic sequencing and in vivo footprinting. Proc Natl Acad Sci USA. 1992;89:1021–1025. [PubMed]
9. Jones DH, Winistorfer SC. Sequence specific generation of a DNA panhandle permits PCR amplification of unknown flanking DNA. Nucleic Acids Research. 1992;20:595–600. [PMC free article] [PubMed]
10. Warshawsky D, Miller L. A rapid genomic walking technique based on ligation-mediated PCR and magnetic separation technology. Biotechniques. 1994;16:792–798. [PubMed]
11. Hui EK, Wang PC, Lo SJ. Strategies for cloning unknown cellular flanking DNA sequences from foreign integrants. Cell Mol Life Sci. 1998;54:1403–1411. [PubMed]
12. Spertini D, Beliveau C, Bellemare G. Screening of transgenic plants by amplification of unknown genomic DNA flanking T-DNA. Biotechniques. 1999;27:308–314. [PubMed]
13. Dai SM, Chen HH, Chang C, Riggs AD, Flanagan SD. Ligation-mediated PCR for quantitative in vivo footprinting. Nat Biotechnol. 2000;18:1108–1111. [PubMed]
14. Collins FS, Weissman SM. Directional cloning of DNA fragments at a large distance from an initial probe: a circularization method. Proc Natl Acad Sci USA. 1984;81:6812–6816. [PubMed]
15. Ochman H, Gerber AS, Hartl DL. Genetic applications of an inverse polymerase chain reaction. Genetics. 1988;120:621–623. [PubMed]
16. Triglia T, Peterson MG, Kemp DJ. A procedure for in vitro amplification of DNA segments that lie outside the boundaries of known sequences. Nucleic Acids Res. 1988;16:8186. [PMC free article] [PubMed]
17. Silver J, Keerikatte V. Novel use of polymerase chain reaction to amplify cellular DNA adjacent to an integrated provirus. Journal of Virology. 1989;63:1924–1928. [PMC free article] [PubMed]
18. Rich JJ, Willis DK. A single oligonucleotide can be used to rapidly isolate DNA sequences flanking a transposon Tn5 insertion by the polymerase chain reaction. Nucleic Acids Research. 1990;18:6673–6676. [PMC free article] [PubMed]
19. Dellaporta SL, Wood J, Hicks JB. A plant DNA minipreparation. Plant Molecular Biology Reporter. 1983;1:19–21.
20. Upcroft P, Healey A. PCR priming from the restriction endonuclease site 3′ extension. Nucleic Acids Research. 1993;21:4854. [PMC free article] [PubMed]
21. Xu W, Purugganan MM, Polisensky DH, Antosiewicz DM, Fry SC, et al. Arabidopsis TCH4, regulated by hormones and the environment, encodes a xyloglucan endotransglycosylase. Plant Cell. 1995;7:1555–1567. [PubMed]
22. Koiwa H, Bressan RA, Hasegawa PM. Identification of plant stress-responsive determinants in arabidopsis by large-scale forward genetic screens. J Exp Bot. 2006;57:1119–1128. [PubMed]
23. Weigel D, Ahn JH, Blazquez MA, Borevitz JO, Christensen SK, et al. Activation tagging in Arabidopsis. Plant Physiol. 2000;122:1003–1014. [PubMed]
24. Krysan PJ, Young JC, Jester PJ, Monson S, Copenhaver G, et al. Characterization of T-DNA insertion sites in Arabidopsis thaliana and the implications for saturation mutagenesis. OMICS. 2002;6:163–174. [PubMed]
25. Zambryski P. Basic processes underlying Agrobacterium-mediated T-DNA transfer to plant cells. Annu Rev Genet. 1988;22:1–30. [PubMed]
26. Nardmann J, Ji J, Werr W, Scanlon MJ. The maize duplicate genes narrow sheath1 and narrow sheath2 encode a conserved homeobox gene function in a lateral domain of shoot apical meristems. Development. 2004;131:2827–2839. [PubMed]
