The jellyfish green fluorescent protein (GFP) can be inserted into the middle of another protein to produce a functional, fluorescent fusion protein. Finding permissive sites for insertion, however, can be difficult. Here we describe a transposon-based approach for rapidly creating libraries of GFP fusion proteins.
We tested our approach on the glutamate receptor subunit, GluR1, and the G protein subunit, αs. All of the in-frame GFP insertions produced a fluorescent protein, consistent with the idea that GFP will fold and form a fluorophore when inserted into virtually any domain of another protein. Some of the proteins retained their signaling function, and the random nature of the transposition process revealed permissive sites for insertion that would not have been predicted on the basis of structural or functional models of how that protein works.
This technique should greatly speed the discovery of functional fusion proteins, genetically encodable sensors, and optimized fluorescence resonance energy transfer pairs.
The discovery that the jellyfish green fluorescent protein (GFP) can form a functional fluorophore without other gene products or co-factors was rapidly followed by reports that GFP can be used to create fluorescent fusion proteins [e.g. [2,3]]. For the first time, it became possible to create a wide variety of genetically encodable fluorescent fusion proteins that could be followed in living systems [reviewed in:]. Most GFP fusion proteins have been built by placing GFP at either the N- or C-terminus of the host protein. This can, however, destroy the function of some host proteins. The alternative is to insert GFP into the middle of the host protein [5-8]. Unfortunately, finding a permissive location for insertion of the GFP can be problematic and time-consuming.
One way of speeding the process is to randomly generate libraries of GFP fusion proteins and then screen for clones that encode functional, fluorescent proteins. One group used a combination of nick translation and nuclease S1 treatment to randomly insert GFP into a cAMP-dependent protein kinase regulatory subunit from Dictyostelium. A surprisingly large number of the resulting fusion proteins were fluorescent and retained cAMP binding, demonstrating that this can be a powerful approach. A weakness of this strategy, however, is that it can produce deletions in the host sequence. Another approach is to use the random behavior of a transposon to insert GFP into many different places in a target protein. Two synthetic transposons have been reported that can produce GFP fusion proteins [9,10]. The design of these transposons included additional protein domains or linkers between the GFP and the target protein, however, and little is known about how many of the resulting proteins continued to function. We reasoned that a Tn5 transposon [11,12] could be designed that would generate GFP fusion proteins with relatively short linkers (~7 amino acids) between the GFP and the host protein analogous to GFP fusion proteins that have already been shown to function [6,8]. To test this approach, we targeted the G protein subunit αs and the glutamate receptor subunit GluR1.
Changes have been made to the Tn5 transposon, and its transposase, that in concert produce a hyperactive transposon capable of a 1% insertion frequency in an in vitro reaction [reviewed in: ]. This hyperactive Tn5 transposon is defined as any sequence flanked by the inverted 19 base pair repeats known as mosaic ends (MEs). The recombinant Tn5 transposase binds these ME sequences and, in the presence of Mg²⁺, catalyzes the random insertion of the transposon into target DNA in a complex process that involves generating a 9 base pair staggered nick in the target. This staggered nick is subsequently repaired to produce a 9 base pair duplication of the target sequence that flanks the inserted transposon. Two possible reading frames extend through the MEs of the Tn5 transposon. Our initial GFP transposon, <EGFP-V>, was created by placing the sequence encoding enhanced green fluorescent protein (EGFP) in one of these frames such that if the transposon landed in another coding sequence, in the correct orientation and frame, it would produce a GFP fusion protein (figure 1A). The low probability of transposition in an in vitro reaction made it necessary to include antibiotic resistance, so Kanr was added to the transposon, flanked by Srf I restriction sites that can be used to subsequently remove it.
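The geometry of an insertion, with the 9 base pair target duplication flanking the transposon, can be illustrated with a minimal Python sketch. The ME sequence is the 19 bp primer sequence quoted in the Methods; the payload and target sequences, and the function names, are hypothetical placeholders chosen only for illustration and are not the sequences used in this study.

```python
# Toy model of a Tn5-style insertion with a 9 bp target-site duplication.
# ME is the 19 bp mosaic end quoted in the Methods; everything else
# (payload, target, function names) is a hypothetical illustration.

ME = "CTGTCTCTTATACACATCT"

def revcomp(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def make_transposon(payload):
    """Flank a payload with inverted copies of the mosaic end."""
    return ME + payload + revcomp(ME)

def insert_with_duplication(target, pos, transposon, dup=9):
    """Insert the transposon so that target[pos:pos+dup] flanks it on both sides."""
    if not 0 <= pos <= len(target) - dup:
        raise ValueError("insertion site must leave room for the duplicated bases")
    return target[:pos + dup] + transposon + target[pos:]

target = "ATGGCCAAGCTTGGATCCGAATTC"      # hypothetical target sequence
transposon = make_transposon("GGGTTT")   # hypothetical stand-in for the EGFP/Kanr payload
product = insert_with_duplication(target, 6, transposon)

# The same 9 bases appear immediately before and after the transposon.
left_copy = product[6:15]
right_copy = product[15 + len(transposon):15 + len(transposon) + 9]
print(left_copy, right_copy)
```

Because the two MEs are inverted repeats, a single primer matching the ME can prime from both ends of the transposon, which is why one primer suffices for amplification.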
An epitope-tagged version of the G-protein subunit αs (αsEE) was chosen as the first target (figure 1B). Previous studies have shown that the N- and C-termini of αs are important for its interactions with receptors, G-protein β and γ subunits, and the plasma membrane [13,14], so placing GFP within internal regions of αs is more likely to generate a functional, fluorescent subunit. Moreover, the structure of αs has been solved, making it possible to interpret the results in the context of the three-dimensional structure. After transposition and transformation, colonies expressing dual antibiotic resistance were screened with PCR to identify clones in which <EGFP-V> had landed in the correct orientation within the coding region (figure 1C). Assuming that Tn5 behavior is random, the probability that <EGFP-V> will land in the αsEE coding sequence during transposition should be the ratio of the coding sequence to the size of the total plasmid (18.5%). However, transpositions that disrupt critical elements of the plasmid (the plasmid origin or the Ampr gene) should not be recovered after transformation, so the predicted probability of observed transpositions within the αsEE coding sequence increases to 23.8%, with half of these (11.9%) being in the correct orientation. PCR screening of 384 Ampr + Kanr resistant colonies identified 44 clones with <EGFP-V> insertions within the αsEE coding region in the correct orientation (11.4%).
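The arithmetic behind these predictions can be sketched in a few lines of Python. This is only an illustration under the stated assumption of uniformly random insertion; the function names are hypothetical, and the fraction of the plasmid treated as essential is back-calculated from the two percentages quoted above rather than taken from a plasmid map.

```python
# Back-of-the-envelope check of the predicted insertion frequencies.
# Assumes uniformly random insertion; function names are illustrative only.

def coding_hit_probability(coding, plasmid, essential=0.0):
    """Chance that a recoverable insertion lies in the coding sequence.

    All arguments share one unit (base pairs or fractions of the plasmid).
    Insertions into `essential` sequence are assumed never to be recovered.
    """
    recoverable = plasmid - essential
    if recoverable <= 0 or coding > recoverable:
        raise ValueError("coding region must fit inside the recoverable plasmid")
    return coding / recoverable

def implied_essential_fraction(p_all, p_recovered):
    """Fraction of the plasmid that must be essential to raise p_all to p_recovered."""
    return 1.0 - p_all / p_recovered

p_all = 0.185        # coding sequence / whole plasmid (from the text)
p_recovered = 0.238  # after excluding the origin and Ampr (from the text)

essential = implied_essential_fraction(p_all, p_recovered)
p_correct_orientation = p_recovered / 2   # half of insertions face the right way
observed = 44 / 384                       # PCR-positive clones / colonies screened

print(f"implied essential fraction: {essential:.3f}")
print(f"predicted correct orientation: {p_correct_orientation:.3f}")
print(f"observed: {observed:.3f}")
```

On these numbers, roughly 22% of the plasmid would have to be off-limits to insertion for the two quoted percentages to agree, and the observed frequency sits close to the prediction.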
Each clone containing an in-frame insertion should encode a truncated αs protein with GFP at the carboxy-terminus due to a stop codon in the Kanr. Thirty-five of the PCR-positive clones were transiently expressed in HEK 293 cells, and 13 were fluorescent. Sequencing confirmed that the 13 fluorescent constructs were truncated αs-GFP fusion proteins (with 12 being unique insertions) and that the remaining 22 <EGFP-V> insertions were out of frame (figure 2A). The 12 clones encoding unique αs-GFP fusion proteins were digested with Srf I and re-ligated to create full-length fusion proteins (figure 2B). Transient expression of each of the 12 αs-GFP fusion proteins in HEK 293 cells produced a fluorescent signal. This is surprising because several insertions appear to be in internal and/or rigid secondary protein structures (figure 3). It appears that the folding of GFP to form a fluorophore is thermodynamically favorable at most insertion sites.
To determine what effect the GFP insertions had on αs localization, the full-length fusion proteins were transiently co-expressed in HEK 293 cells with G protein subunits β1 and γ7, which have been shown to mediate signaling between the β-adrenergic receptor and Gs. Amino- and carboxy-terminus GFP fusions, αs-GFP(N) and αs-GFP(C), respectively, were also co-expressed with β1 and γ7 for comparison. The end-labeled GFP fusions and two of the transposon insertions, αs-GFP(18–20) and αs-GFP(92–94), showed clear localization to the plasma membrane (figure 4A). The remaining 10 fusion proteins displayed a uniform fluorescence signal throughout the cytoplasm (figure 4B).
The fusion proteins were tested for function by assaying their abilities to stimulate adenylyl cyclase in response to receptor stimulation. They were co-expressed with the luteinizing hormone (LH) receptor in HEK 293 cells and cAMP accumulation was measured in both the presence and absence of the LH receptor agonist, human chorionic gonadotropin (hCG). Basal and stimulated cAMP accumulation in cells expressing αs-GFP(18–20), αs-GFP(92–94), αs-GFP(N), or αs-GFP(C) were higher than in cells expressing vector alone (figure 5A). However, only in cells expressing αs-GFP(92–94) were these differences statistically significant (p < 0.05). The basal and stimulated activities of αs-GFP(92–94) were less than those of αs, although these differences were not statistically significant at the p < 0.05 level. The remaining 10 of the 12 fusion proteins exhibited no detectable activity. One possible explanation for the decreased activities of the αs-GFP fusion proteins relative to αsEE would be a decrease in protein expression level. Cell fractionation and immunoblotting with an anti-EE monoclonal antibody showed that both αs-GFP(92–94) and αs-GFP(18–20) were expressed at lower levels than αsEE, in contrast to αs-GFP(N) and αs-GFP(C) (figure 5B).
Interpreting these results in the context of the structure of αs leads to a surprising result. A rational approach to designing a fluorescent, functional αs-GFP fusion protein would have most likely targeted the exposed loops [e.g. ], yet these insertions were not functional. The most functional protein was produced by the insertion of GFP into an α-helix that one would have avoided (figure 3).
The discovery that all of the in-frame insertions in αs produced truncated fluorescent fusion proteins suggested that we could identify in frame insertions by transiently expressing all of the transposed clones and visually screening them for fluorescence. This alternative screening strategy could be particularly useful for large coding regions where a PCR-based screen might fail. To reduce the number of transient transfections required, a second transposon was created with enhanced cyan fluorescent protein (ECFP). Two separate transpositions with the different colored transposons, followed by co-transfections in the visual screen (one potential green clone and one potential cyan clone per well), can identify twice as many in-frame insertions in a given number of transfections. This approach could be expanded to encompass many different fluorophores.
In the experiments with αs, several clones were recovered with identical transposon insertions. This is consistent with previous reports of Tn5 preferentially inserting into particular locations in the target sequence [17,18]. Since these "hotspots" could become a limiting factor in the number of unique insertions recovered within a target sequence, the second reading frame through the Tn5 MEs was used for the ECFP transposon. This doubles the number of potentially useful insertion sites within a given target sequence.
The glutamate receptor subunit GluR1 was used to test the new transposons and the visual screening process. Independent transpositions of the GluR1 plasmid were performed with the EGFP and ECFP transposons (<TgPT-0> and <TcPT-1>, respectively). In 288 co-transfections, there were 20 wells with EGFP fluorescence, 21 wells with ECFP fluorescence, and 2 wells with both EGFP and ECFP fluorescence. Sequencing revealed 35 unique insertions (17 <TgPT-0> and 18 <TcPT-1>) and 10 repetitive insertions (figure 6A). The recovery of 45 fluorescent clones from 576 colonies (7.8%) agrees with the predicted frequency of transpositions resulting in GluR1-EGFP/ECFP fusions (7.7%), which is consistent with the interpretation that all in-frame insertions produce a fluorescent protein. Clones representing unique fluorescent fusion proteins were digested with Srf I to remove the Kanr selection cassette and re-ligated to generate full-length GluR1-EGFP/ECFP fusions. These fusion proteins were screened, in transiently transfected HEK 293 cells, for glutamate-gated ion channel function. Of the 29 unique tribrid fusion constructs tested, all produced detectable fluorescence and 6 were functional (figure 6B).
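The bookkeeping for the two-color visual screen can be checked with a short Python sketch. All counts come from the paragraph above; the variable names are illustrative, and the only assumption is that each co-transfected well received exactly one EGFP clone and one ECFP clone, as described for the screen.

```python
# Tally of the two-color visual screen, using the counts reported above.
# Assumes one <TgPT-0> clone and one <TcPT-1> clone per co-transfected well.

wells = 288
clones_per_well = 2
colonies_screened = wells * clones_per_well

green_only_wells = 20
cyan_only_wells = 21
both_color_wells = 2

# A well showing both colors contributes two fluorescent clones.
fluorescent_clones = green_only_wells + cyan_only_wells + 2 * both_color_wells

unique_insertions = 17 + 18   # <TgPT-0> plus <TcPT-1>
repeat_insertions = 10

recovery_frequency = fluorescent_clones / colonies_screened
functional_fraction = 6 / 29  # functional channels among the constructs tested

print(colonies_screened, fluorescent_clones)
print(f"recovery frequency: {recovery_frequency:.3f}")
print(f"functional fraction: {functional_fraction:.3f}")
```

The well counts, the sequencing counts, and the reported total of 45 fluorescent clones are mutually consistent under this assumption.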
Creating functional, fluorescent fusion proteins involves finding a permissive site for the insertion of GFP, a process that in most cases still involves some guesswork. The results of both the αs and GluR1 transpositions illustrate this point. Based on previous studies with the G protein subunit αq, we anticipated that an insertion within an exterior flexible loop region of αs would be most likely to produce a functional fusion protein. Surprisingly, the αs fusion protein that was the most functional, αs-GFP(92–94), resulted from an insertion into an α helix (figure 3), while the insertions in exposed loops, αs-GFP(67–69) and αs-GFP(188–190), were not functional. Similarly, in the case of GluR1, one of the insertions that produced a functional channel, GluR1-GFP(526–528), was within the hydrophobic region thought to be the first transmembrane domain (see Additional File: Figure 7). Additionally, within a given region of GluR1, one insertion will produce a functional channel while another nearby insertion does not (e.g. the intracellular carboxy-terminus region or the amino terminus between amino acids 210 and 330). The reasons for these discrepancies are not obvious.
The discovery that GFP will still fold and form a fluorophore when placed virtually anywhere in another coding region suggests that the limiting step in the process is whether the target protein it is inserted into folds and functions correctly. Indeed, GFP fusion constructs have been used to assay, and improve upon, the folding of a variety of proteins in a bacterial expression system. The relatively random nature of the transposition events we recovered in this study suggests that it might be possible to insert GFP at nearly every position in a given protein, but there are two potential limits. First, the laws of probability predict that there will be rapidly diminishing returns in the search for unique Tn5 insertions as one recovers each additional clone. Second, the behavior of the Tn5 transposon is not entirely random. Goryshin and colleagues have shown that there is a weak consensus site for Tn5 insertion, which is consistent with our results. It appears that the resolution limit will be, on average, one insertion every three amino acids in a target protein.
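The diminishing returns mentioned above can be made concrete with a minimal Python sketch. It assumes an idealized model in which every in-frame clone is drawn independently and with equal probability from a fixed number of possible sites; this is a simplification, since Tn5 has insertion hotspots, and the site count used in the example is hypothetical.

```python
# Diminishing returns when sampling insertion sites at random.
# Idealized model: n_sites equally likely sites, clones drawn independently.
# Real Tn5 insertion is not uniform, so this is an optimistic bound.

def expected_unique_sites(n_sites, n_clones):
    """Expected number of distinct sites hit after sequencing n_clones."""
    return n_sites * (1.0 - (1.0 - 1.0 / n_sites) ** n_clones)

def marginal_gain(n_sites, n_clones):
    """Probability that the next clone, after n_clones, hits a new site."""
    return (1.0 - 1.0 / n_sites) ** n_clones

n_sites = 100  # hypothetical number of usable in-frame sites
for n_clones in (10, 50, 100, 200, 400):
    unique = expected_unique_sites(n_sites, n_clones)
    gain = marginal_gain(n_sites, n_clones)
    print(f"{n_clones:4d} clones: {unique:6.1f} unique sites, next-clone gain {gain:.3f}")
```

Under this model the chance that each additional clone reveals a new site falls geometrically, so hotspots in the real transposon would only make the returns diminish faster.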
Inserting a reporter domain such as GFP into another protein always has the potential of perturbing the target and destroying its ability to function. In this study 16% of the tribrid fusion proteins were still functional. One explanation for why GFP can be used for internal insertion is that the N- and C-termini of GFP exit the structure quite close to one another and are unlikely to displace the surrounding domains of the target protein a great deal. This is analogous to the use of the bovine pancreatic trypsin inhibitor for internal insertions. The transposons described here could potentially be improved upon by optimizing the length and flexibility of the linkers between the target and the GFP. Another potential improvement to the process would be to use bacterial expression to screen for transposon insertions that produce a fluorescent protein. This could, however, be problematic with proteins from the mammalian nervous system, such as ion channels, that are difficult to express in bacteria.
The approach described here should speed the discovery of genetically encodable fluorescent sensors. The pioneering work of Siegel and Isacoff showed that GFP placed within a portion of the Shaker K⁺ channel C-terminus produced a fluorophore that responded to changes in membrane voltage, but they built a number of different constructs before finding one that worked. Similarly, Ataka and Pieribone created an EGFP-Na⁺ channel fusion protein that changes fluorescence in response to membrane depolarizations on a time-scale that would be sufficient to image action potentials. This discovery, however, was the result of designing, building, and testing eight different tribrid fusion proteins. Little is known about the mechanism whereby changes in channel conformation are converted to changes in the fluorophore, so it remains to be determined whether GFP can signal conformational changes in other kinds of proteins. Nevertheless, the use of the transposons described here should shift the work from building the constructs to devising high throughput assays for function.
Finally, random GFP tagging will facilitate the creation of potential fluorescence resonance energy transfer (FRET) reagents to study protein interactions in living systems. To date, a few studies have demonstrated the potential power of GFP-FRET by labeling different proteins [6,23-26] or by fusing two different fluorophores to the same protein [27-32]. Creating efficient donor and acceptor fusion proteins is difficult, however, because FRET only occurs when the two fluorophores are attached to surfaces that are very close to one another. The approach described here makes it possible to rapidly generate libraries of potential donor and acceptor tribrid fusion proteins that can be screened, in pairwise combinations, for function and FRET signals.
The transposons described here make it possible to rapidly generate large numbers of different GFP fusion proteins. The results show that GFP can be inserted into a wide variety of other protein domains and it will continue to fold and form a fluorophore. The rapid and random nature of the transposition process makes it possible to generate and screen many different fusion constructs to identify those that continue to function. In the case of the two proteins tested here, roughly 1 in 6 of the fusion proteins retained their signaling function, and the random nature of the transposition process revealed permissive sites for insertion that would not have been predicted on the basis of structural or functional models of how that protein works. This simple tool should speed the search for a wide variety of new biological probes for the study of the nervous system.
PCR and standard subcloning procedures were used to create the initial transposon, <EGFP-V> (full sequence at: http://momotion.med.yale.edu). The Tn5 MEs were added to the 5' and 3' ends of an EGFP coding sequence, with a Srf I restriction site at its 3' end, such that one continuous reading frame extended through both MEs and EGFP (figure 2). To add antibiotic selection, the Kanr gene from pUniV5-His-TOPO™ (Invitrogen, Carlsbad, CA) was flanked with Srf I sites and inserted into the transposon. The improved transposons, <TgPT-0> and <TcPT-1>, were created in the same way as <EGFP-V>, but Asc I sites were added to facilitate changing the fluorescent protein at a later date (supplemental material). In addition, the two different reading frames present in the MEs were used to create the two different transposons, and ECFP was used in place of EGFP in <TcPT-1>. A primer complementary to the 19 bp Tn5 ME (5'-CTGTCTCTTATACACATCT-3') was used to amplify the transposons (1 cycle at 95°C for 3:30 min; 24 cycles of 95°C for 30 sec, 47°C for 30 sec, and 72°C for 1 min; 1 cycle at 72°C for 5 min) with Pfu polymerase (Stratagene, La Jolla, CA). The PCR product was purified and concentrated with the Geneclean II kit (Bio101 Inc., Vista, CA) and eluted in 1X TE buffer. 0.2 fmoles of transposon were incubated with 5.0 μL of EZ::TN™ transposase (Epicentre Technologies) in 25% glycerol at 25°C for 30 min.
Molar equivalents of transposon and target plasmid (0.4 fmoles ea.) were incubated in reaction buffer (50 mM Tris-acetate (pH 7.5), 150 mM potassium acetate, 10 mM magnesium acetate and 4 mM spermidine) at 37°C for 2 hr in a 10 μL reaction. Transposition was stopped by adding 1 μL of 1% SDS and incubating at 70°C for 10 min. Top 10 F' E. coli (Stratagene) were transformed with 1 μL of the transposition reaction and plated on LB agar with either ampicillin (100 μg/mL) and kanamycin (50 μg/mL) to recover transposed clones, or ampicillin (100 μg/mL) alone to establish the transposition efficiency.
The cDNA encoding the rat αs, modified to carry the EE epitope, was in pcDNA1/Amp (Invitrogen). GFP was added to the N- or C-terminus of αs to create end-labeled constructs for comparison with the transposed GFP tribrid fusion proteins. The amino-labeled clone, GFP-[GGGPSGGGGS]-αsEE, and carboxy-labeled clone, αsEE-[SGGGGSGQH]-GFP, were generated via overlap extension. Linker sequences are in brackets. The flip variant of rat GluR1 was in the CMV expression plasmid pRK5 (a generous gift from Derek Bowie, Emory University, Atlanta, GA).
PCR screening for <EGFP-V> insertions within the αsEE coding region was performed using a protocol described by Cease et al., with an upper primer complementary to the 5' UTR (5'-GCTCCCGCGGCTCCTGCTCTGCTC-3') and a lower primer complementary to EGFP (5'-GCCGTCGCCGATGGGGGTGTTCTG-3'). The clones that produced clear PCR products within the expected size range were then miniprepped (QIAgen, Germantown, MD).
Insertion sites were identified for all PCR-positive <EGFP-V> transposed clones and all fluorescent <TgPT-0>/<TcPT-1> transposed clones by sequencing out of the transposon with a primer complementary to the EGFP/ECFP coding region (5'-tggccgtttacgtcgccgtcca-3'). Srf I restriction digestion was then used to remove the Kanr cassette from the clones carrying in-frame insertions, thereby creating a sequence encoding a full-length fusion protein. After digestion and re-ligation, Top 10 F' E. coli were transformed with 1 μL of the ligation reaction and plated on LB agar containing ampicillin. The colonies were re-plated the following day on ampicillin and kanamycin to verify loss of the Kanr.
The fusion proteins were transiently expressed in HEK 293 cells. Transfections were done using Lipofectamine 2000 (Gibco BRL). Images were collected from live cells 20–48 hr later on an inverted Zeiss microscope fitted with computer controlled (IPLabs, Scanalytics) filter wheels (Ludl Electronics) on the excitation and emission paths. EGFP was imaged with an FITC filter set, while ECFP was distinguished from EGFP in co-expression experiments by changing both the excitation and emission filter sets (Exciters: 440AF21 & 500AF25, Dichroic cat# XF 2063, Emitters 480AF & 545AF35; Omega, Brattleboro, VT).
αs-GFP fusion proteins were assayed for the ability to stimulate adenylyl cyclase in response to luteinizing hormone (LH) receptor stimulation. 10⁶ HEK 293 cells per 60 mm dish were co-transfected with 2 μg of plasmid DNA encoding the αs-GFP fusion protein and 0.2 μg of plasmid DNA encoding the rat LH receptor in pCIS, using 10 μL of Lipofectamine 2000. [³H]-adenine-labeled cells were assayed for cAMP accumulation after incubation at 37°C for 40 min in the presence of 1 mM 3-isobutyl-1-methylxanthine (IBMX), a phosphodiesterase inhibitor, and in the presence or absence of 7.5 ng/mL human chorionic gonadotropin (hCG), as described previously. Conversion of ATP to cAMP was expressed as:
10³ × [³H]cAMP/([³H]ATP + [³H]cAMP).
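As a small worked illustration of this expression, the conversion can be computed from the radioactivity recovered in the cAMP and ATP fractions. The Python sketch below is a minimal example; the function name and the sample counts are hypothetical and are not data from this study.

```python
# Conversion of [3H]ATP to [3H]cAMP, scaled by 10^3 as in the expression above.
# Function name and example counts are hypothetical illustrations.

def camp_conversion(camp_counts, atp_counts):
    """Return 10^3 x cAMP / (ATP + cAMP) for counts in the same units."""
    total = camp_counts + atp_counts
    if camp_counts < 0 or atp_counts < 0 or total <= 0:
        raise ValueError("counts must be non-negative with a positive total")
    return 1e3 * camp_counts / total

basal = camp_conversion(camp_counts=10, atp_counts=990)        # made-up counts
stimulated = camp_conversion(camp_counts=40, atp_counts=960)   # made-up counts
print(basal, stimulated, stimulated / basal)
```

Normalizing to the sum of both fractions corrects for dish-to-dish differences in [³H]-adenine labeling, so values from different transfections can be compared directly.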
12 × 10⁶ HEK 293 cells were transfected, using DEAE-dextran, with 25 μg of plasmid DNA. Forty-eight hours after transfection, cells were lysed and membrane and supernatant fractions harvested as described previously. 10 μg of membrane proteins and normalized volumes of the supernatants were resolved by SDS-polyacrylamide electrophoresis (10%), transferred to nitrocellulose, and probed with a monoclonal antibody to the EE epitope. The antigen-antibody complexes were visualized with ECL chemiluminescence (Amersham Biosciences, Piscataway, NJ).
Whole-cell patch clamp recording was used to test the GluR1 fusion proteins for function in transiently transfected HEK 293 cells as previously described. The external solution was (in mM): 150 NaCl, 3 KCl, 2 CaCl2, 1 MgCl2, 5 glucose, 0.002 glycine and 10 HEPES (pH 7.4). Patch pipettes were filled with a solution containing (in mM): 120 CsF, 33 KOH, 2 MgCl2, 1 CaCl2, 0.1 spermine, 10 HEPES, and 11 EGTA (pH 7.4). Cyclothiazide was prepared as a 20 mM stock solution in DMSO and diluted to 100 μM in external solution. All chemicals were purchased from Sigma. Drugs were applied with a rapid superfusion system made from a pulled theta capillary. The open tip responses obtained with this system had 10–90% rise-times of 150 μs to 300 μs.
D. L. Sheridan carried out the design and construction of the transposons, conducted the biochemical assays for G-protein signaling, imaged the living cells, and drafted the manuscript. C. H. Berlot provided critical reagents and advice for all portions of the G-protein work. A. Robert screened the glutamate receptor subunits for function. F. M. Inglis and K. B. Jakobsdottir provided help in the minipreparation of plasmid DNA for each of the constructs. J. R. Howe and T. E. Hughes participated in the study design, coordination, and analysis.
All authors read and approved the final manuscript.
We thank Tom Hynes for creating figure 3, the members of the Friday Afternoon Lab Meeting for their input, Janet Robishaw for the human β1 in pCMV5 and HA-tagged γ7 in pCI-neo plasmids, Derek Bowie for the GluR1flip in pRK5 plasmid, Jim Boulter for his suggestions, and Michael Hollmann for permission to adapt his GluR1 topology figure. This work was supported by: NIH RO1 EY 08362 (to T.E.H.), NIH RO1 GM 50369 (to C.H.B.) and NIH RO1 NS 37904 (to J.R.H.). D. L. Sheridan is an HHMI Predoctoral Fellow.