Many GFP variants have been developed for use as fluorescent tags, and recently a superfolder GFP (sfGFP) has been developed as a robust folding reporter. This new variant shows increased stability and improved folding kinetics, as well as 100% recovery of native protein after denaturation. Here, we characterize sfGFP, and find that this variant exhibits hysteresis as unfolding and refolding equilibrium titration curves are non-coincident even after equilibration for more than eight half-lives as estimated from kinetic unfolding and refolding studies. This hysteresis is attributed to trapping in a native-like intermediate state. Mutational studies directed towards inhibiting chromophore formation indicate that the novel backbone cyclization is responsible for the hysteresis observed in equilibrium titrations of sfGFP. Slow equilibration and the presence of intermediates imply a rough landscape. However, de novo folding in the absence of the chromophore is dominated by a smoother energy landscape than that sampled during unfolding and refolding of the post-translationally modified polypeptide.
Green Fluorescent Protein (GFP) has become a common label for many in vivo and in vitro applications due to GFP’s ability to fold and form a visually fluorescent chromophore through autocatalytic cyclization and dehydration/oxidation reactions1. GFP has been used as a reporter in folding2; 3, protein-protein interactions4; 5; 6; 7, and gene translation8; 9; 10. However, the rate of GFP folding and chromophore formation limits the temporal resolution of these techniques, and their common use in high-throughput applications11.
Studies of GFP have shown that folding is slow and that oxidation of the chromophore is the limiting step in fluorescence maturation12. As the refolding of mature GFP is not fully reversible and is prone to aggregation13, variants of GFP have been developed with better folding and fluorescence properties- namely Cycle3 GFP (F99S M153T V163A) 14, and GFPmut2 (S65A V68L S72A) 15. Folding improvements in Cycle3 GFP are attributed to avoiding aggregation traps in folding13; still, folding is very slow (t½ = 4.5 min) and limited by a proline isomerization13; 16. Single molecule and solution studies of GFPmut2 show a lower stability and slower kinetics than Cycle317; 18, probably as it was optimized for in vivo fluorescent use rather than for stability and folding kinetics. While Cycle3 GFP has improved folding, the recovery of fluorescence after guanidine, urea, or acid denaturation is still not fully reversible, with a reported 80% rescue of native protein13. Thus, improving refolding efficiency to 100% would create a better reporter for a variety of applications3; 19; 20.
Recently, a “superfolder” GFP (sfGFP) variant has been developed to minimize aggregation and speed up folding. This was accomplished by fusing it to the poorly folding protein ferritin, which also improves ferritin folding and recovery21. The sfGFP includes the Cycle3 mutations F99S, M153T, and V163A, the enhanced GFP mutations F64L and S65T, as well as six other mutations realized through directed evolution (S30R, Y39N, N105T, Y145F, I171V, A206V). This variant also contains the original Q80R mutation from PCR cloning22. This variant exhibits 100% fluorescence recovery after refolding from the urea denatured state, as well as faster refolding kinetics than cycle3 GFP when compared at a single denaturant concentration. The robustness of sfGFP folding is attributed to reduced misfolding and aggregation. These results are consistent with the design strategy of sfGFP, in which variants were generated by selecting bright fluorescent colonies while expressing an aggregation-prone sfGFP-ferritin fusion protein21. Much of the folding improvement is attributed to the S30R mutation, which extends an internal ion-pair network within the interior of sfGFP21.
In this study, we characterized the full folding landscape of sfGFP using stopped flow and manual-mixing fluorescence kinetics, as well as fluorescent and circular dichroism (CD) equilibrium techniques. We find that sfGFP folds ten-fold faster than cycle3 GFP over a range of denaturant concentrations, and exhibits apparent hysteresis in equilibrium unfolding and refolding experiments, suggestive of a non-optimized energy landscape for this protein. However, while the funneled organization of the energy landscape dominates the kinetics of folding for highly evolved and/or well designed systems, GFP may be a special case because of its unusual chromophore. That is, the de novo folding pathway of GFP proceeds prior to chromophore formation while the refolding of GFP proceeds in the presence of the already cyclized chromophore. This landscape is not subject to the same evolutionary pressure as is folding in vivo. Consistent with this hypothesis, the apparent hysteresis is ameliorated in the presence of mutations that impede chromophore formation. One-dimensional 1H NMR spectra show that sfGFP trapped in refolding has the same overall topology as sfGFP, while displaying subtle structural perturbations in the novel central helix containing the backbone cyclized chromophore. Taken together, our data provide the first explanation for the long-held observation of the inability to observe true equilibration in GFP folding based on unique structural characteristics of GFP, compared to other proteins.
The sfGFP was developed for robust folding by directed evolution and contains twelve mutations with respect to the wild-type protein21. These mutations include the mutation Q80R from PCR cloning, Cycle3 mutations F99S, M153T, and V163A, the enhanced GFP mutations F64L and S65T, as well as six other mutations (S30R, Y39N, N105T, Y145F, I171V, A206V), as reported previously21. In the current study we characterize the folding landscape of sfGFP.
We used a combination of techniques to characterize the folding of sfGFP. The characteristic optical spectra of the native and denatured sfGFP are shown in Figure 1. The fluorescence emission spectrum obtained upon excitation of the chromophore at 450 nm is similar to wtGFP (Figure 1a), with a peak at 508 nm, and a red-shifted, small shoulder arising from the protonated chromophore1. Protein unfolding is accompanied by a complete loss of fluorescence emission at 508 nm. The fluorescence emission spectra of a second probe, the single tryptophan (W57) of sfGFP, upon excitation at 295 nm for the native and denatured states are given in Figure 1b. The native emission spectrum has a fluorescence maximum at 328 nm. This is consistent with sequestration of the tryptophan side chain from solvent and packing into the protein core. The emission spectrum of the denatured protein is red-shifted relative to that observed for the native protein, with a fluorescence maximum at 355 nm, and is consistent with the exposure of the tryptophan side chain to aqueous solution upon unfolding23. A third spectroscopic probe, the far-UV circular dichroism (CD) spectrum of native sfGFP (Figure 1c), exhibits a minimum at 217 nm with a shoulder at 222 nm, consistent with the small fraction of α-helix within the large β-barrel of sfGFP. Denaturation results in a loss of the characteristic CD spectrum of the native protein.
Previous equilibrium unfolding of wild-type and mutant GFPs indicated a high stability of the native protein, but problems with protein aggregation resulted in a maximal recovery of ≈80% of native signal during refolding21. Preliminary studies with sfGFP indicated that refolding and unfolding are fully reversible, as 100% of native signal is recovered under strongly refolding conditions. Curves of the fraction of unfolded protein (Funfolded) as a function of final denaturant concentration, detected by changes in chromophore fluorescence, are shown in Figure 2 for the unfolding transition after 24, 96, and 192 hours of equilibration. The unfolding transition is unchanged between 96 and 192 hours, indicating that an “equilibrium” has been achieved.
To confirm the reversibility of the folding transition, we performed a refolding equilibrium analysis; the results are plotted with the equilibrium unfolding data in Figure 3a. The unfolding transition fits to a three-state model; a two-state model (shown, Figure 3a, cyan, dotted line) was inadequate to fit the data. The refolding transition, however, fits to a two-state model. Interestingly, while 100% of the fluorescence amplitude and CD signal are recovered under strongly refolding conditions (≤ 1.7 M guanidine hydrochloride), the unfolding and refolding transition curves are non-coincident (see Discussion). These results are consistent with observations from prior urea equilibrium studies, in which the equilibrium refolding transition was observed at approximately 4.5 M urea at 25°C only after denaturation in 9 M urea at 95°C21.
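The two-state and three-state descriptions used above can be sketched numerically. The following is a minimal illustration, assuming the common linear extrapolation model (free energy of unfolding linear in denaturant concentration); the fitting procedure actually used is described in Methods and may differ. All parameter names and values (`dg_water`, `m_value`, `z_int`, and the numbers in the example loop) are hypothetical and are not fitted values from this study.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def unfolding_constant(dg_water, m_value, denaturant, temp_k=298.15):
    """Equilibrium constant K = exp(-(dG_water - m*[D]) / RT).

    Assumes the free energy of unfolding varies linearly with the
    denaturant concentration [D] (linear extrapolation model).
    """
    dg = dg_water - m_value * denaturant
    return math.exp(-dg / (R_KCAL * temp_k))


def two_state_fraction_unfolded(denaturant, dg_water, m_value, temp_k=298.15):
    """Fraction unfolded for N <-> U."""
    k = unfolding_constant(dg_water, m_value, denaturant, temp_k)
    return k / (1.0 + k)


def three_state_apparent_unfolded(denaturant, dg1, m1, dg2, m2, z_int,
                                  temp_k=298.15):
    """Apparent fraction unfolded for N <-> I <-> U.

    z_int is the (hypothetical) fractional signal of the intermediate,
    between 0 (native-like signal) and 1 (unfolded-like signal).
    """
    k1 = unfolding_constant(dg1, m1, denaturant, temp_k)  # N <-> I
    k2 = unfolding_constant(dg2, m2, denaturant, temp_k)  # I <-> U
    total = 1.0 + k1 + k1 * k2
    f_int = k1 / total
    f_unf = k1 * k2 / total
    return z_int * f_int + f_unf


if __name__ == "__main__":
    # Hypothetical parameters chosen only to show the shape of the curves.
    for conc in (0.0, 2.0, 4.0, 6.0, 8.0):
        two = two_state_fraction_unfolded(conc, 10.0, 2.0)
        three = three_state_apparent_unfolded(conc, 6.0, 1.5, 8.0, 1.5, 0.5)
        print(f"{conc:4.1f} M  two-state {two:.4f}  three-state {three:.4f}")
```

In the two-state case the midpoint of the transition falls at `dg_water / m_value`; a populated intermediate broadens or adds a shoulder to the curve, which is why a single sigmoid can fail to describe the unfolding data.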
To test whether an irreversible process to which our probes were insensitive occurred during unfolding, we performed additional studies to check for non-coincidence of the unfolding and refolding transitions. First, we varied the protein concentration over an order of magnitude and found identical behavior (data not shown), suggesting that higher order association was not responsible for this result. Second, we performed equilibrium denaturation on protein that had previously been subjected to the unfolding-refolding cycle. The unfolding transition curve for the unfolded-refolded protein is identical to the original unfolding curve obtained for the native sfGFP, which had never previously been subjected to unfolding (data not shown). Third, we monitored the unfolding and refolding equilibrium transitions by tryptophan fluorescence as well as far-UV CD. The results of these studies are identical to the transitions observed by chromophore fluorescence, showing that the results are independent of the spectroscopic method used. Fourth, we compared the unfolding over time of protein which had previously been subjected to the unfolding-refolding cycle to that of protein which had never been unfolded. The decay in fluorescence amplitude for the native and the refolded sfGFP exhibits the same rate and amplitude. Therefore, there is no loss of stability in the refolded protein (Figure 3b). Taken together, these data indicate that the apparent hysteresis between equilibrium unfolding and refolding transitions is an intrinsic property of the system, and is independent of experimental setup (see Discussion).
The kinetics of sfGFP unfolding were monitored by both manual-mixing and stopped-flow fluorescence techniques. Unfolding was initiated by rapid dilution of native protein into varying final Gdn-HCl concentrations, and the unfolding relaxation times (τui) were determined by fitting the time-dependent change in the fluorescence intensity (see Methods). The observed relaxation times, τui, are plotted as a function of final denaturant concentration and the resulting chevron plot24 is presented in Figure 4. Manual mixing and stopped flow results exhibit two relaxation phases at 7.0 M Gdn-HCl and higher, and three phases at 6.5 M Gdn-HCl and lower. The τu3 phase shows weak dependence on the final denaturant concentration, and the τu2 relaxation time is independent of the final Gdn-HCl concentration between 5 and 6.5 M Gdn-HCl (Figure 4). The τu1 phase shows the strong denaturant dependence expected for an unfolding reaction. Above 6.5 M Gdn-HCl, τu2 and τu1 coalesce. Extrapolation of the unfolding denaturant dependence of the τu2 phase to 0 M Gdn-HCl indicates a half-life of 28 years for unfolding at room temperature.
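The extrapolation of a chevron arm to 0 M denaturant can be illustrated as follows. This is a minimal sketch that assumes the logarithm of the relaxation time is linear in denaturant concentration along the arm being extrapolated, and that a half-life is obtained from a single-exponential time constant as t½ = τ ln 2. The function names and the synthetic data points are hypothetical and are not the measurements plotted in Figure 4.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def extrapolate_tau_to_water(denaturant_molar, tau_seconds):
    """Extrapolate ln(tau) versus [denaturant] linearly to 0 M."""
    ln_tau = [math.log(t) for t in tau_seconds]
    _slope, intercept = fit_line(denaturant_molar, ln_tau)
    return math.exp(intercept)


def half_life_from_tau(tau):
    """Half-life of a single-exponential process with time constant tau."""
    return tau * math.log(2.0)


if __name__ == "__main__":
    # Synthetic, hypothetical unfolding arm: tau shortens as denaturant rises.
    concs = [7.0, 7.5, 8.0]
    taus = [1.0e9 * math.exp(-2.0 * c) for c in concs]
    tau_water = extrapolate_tau_to_water(concs, taus)
    years = half_life_from_tau(tau_water) / SECONDS_PER_YEAR
    print(f"extrapolated half-life in water: {years:.1f} years")
```

Because the extrapolation spans several molar units of denaturant, small uncertainties in the fitted slope translate into large uncertainties in the value at 0 M, so extrapolated half-lives of this kind are best read as order-of-magnitude estimates.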
The refolding of sfGFP was followed by manual mixing and stopped-flow fluorescence techniques. The time-dependent change in the fluorescence intensity within the first 200 seconds of a typical refolding reaction is presented in Figure 5. The first 400 ms of the reaction (Figure 5, inset) shows a lag phase, which is evidence of an on-route, obligatory intermediate25; 26, as has been reported previously for cycle3 GFP16; 27. Three relaxation phases adequately describe the refolding process (see Figure 4) and, unlike the unfolding kinetics, each phase shows a strong denaturant dependence. The kinetic rates were at least an order of magnitude faster than observed for wtGFP and cycle3 GFP13. The time constants for the lag phase were estimated from the length of the lag phase as a function of guanidine concentration, and are shown as filled triangles in Figure 4. Different symbols are used because these time constants were estimated rather than obtained by fitting direct kinetic data. The increase in the persistence of the lag phase may be attributed to the stabilization of the intermediate compared to the native state. Interestingly, no observable folding kinetics could be recorded between 1.75 and 5 M Gdn-HCl for either the unfolding or refolding reactions by any technique. The denaturant dependence of the unfolding reactions and its associated amplitudes would suggest that neither kinetic rates nor amplitude concerns should account for the inability to measure unfolding between 4.5 and 5 M Gdn-HCl (see Discussion). Extrapolation of the denaturant dependence of the refolding relaxation times to 0 M denaturant indicates half-lives of 390 ms for τR1, 79 ms for τR2, and 1.1 ms for τR3. Corresponding time constants (directly from Figure 4) are 520 ms, 140 ms, and 16 ms, respectively. Fitting of the overall apparent rate from the known folding and unfolding rates (kapp = ku+kr) gives a maximum half-life of 245 hours, or 10.2 days. However, as described above, there are no data in the hysteresis zone, and these numbers are an approximation based on the rates measured outside that zone.
The unique β-barrel surrounding an α-helix fold of GFP has extremely high values of absolute contact order (29.5) and 214 sequence-distant pairs, leading to slow folding kinetics28.
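The relation between the apparent rate kapp = ku + kr and the quoted half-life can be made explicit with a short calculation. This sketch assumes a simple two-state relaxation, for which the observed rate is the sum of the unfolding and refolding rates and the half-life is ln 2 / kapp. The individual rate values used in the example are hypothetical placeholders; only the 245 hour figure comes from the text.

```python
import math


def apparent_rate(k_unfold, k_refold):
    """Observed relaxation rate for a two-state process: k_app = k_u + k_r."""
    return k_unfold + k_refold


def half_life(k_app):
    """Half-life of a single-exponential relaxation with rate k_app."""
    return math.log(2.0) / k_app


def hours_to_days(hours):
    return hours / 24.0


if __name__ == "__main__":
    # Hypothetical rates in units of 1/hour, for illustration only.
    k_app = apparent_rate(0.001, 0.002)
    print(f"half-life for hypothetical rates: {half_life(k_app):.0f} h")
    # Unit conversion of the half-life quoted in the text.
    print(f"245 h = {hours_to_days(245.0):.1f} days")
```

The half-life is longest where ku and kr are both small, which in a chevron plot is near the point where the unfolding and refolding arms intersect.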
The refolding of sfGFP shows three relaxation phases, all of which are Gdn-HCl dependent (Figure 4). Multiple phases are evidence of proline-isomerization limiting folding29. Unfolding relaxation kinetics has a phase (τu2), which has no guanidine dependence; potential evidence of a proline isomerization becoming a rate-limiting step in unfolding at intermediate guanidine concentrations. The presence of cis and trans prolyl-peptide bonds in the native state that give different isoforms with differing stabilities would also give rise to these results 29. According to the original proline isomerization hypothesis, the protein must have each of its proline imide bonds in the correct native state isomerization before refolding. Protein molecules which contain incorrect isomerization states must wait for the correct imide bond state before refolding. More recent studies have shown that incorrect isomers may fold, and the coupling of folding and isomerization will yield different rates for the folding process30; 31; 32.
To test for the role of proline isomerization in the folding kinetics of sfGFP, a series of “double jump” or interrupted refolding experiments were carried out using manual mixing. This experiment involves unfolding the protein quickly (relative to proline isomerization) at high denaturant concentration. After a variable delay time, t, we dilute out the denaturant and measure the total intensity change and associated rates for refolding to the native state. If proline isomerization is occurring, there will be a progressive change in the amplitude of the observed refolding phases as a function of the delay time prior to refolding. The experiment involves a double jump of 0 M to 7.4 M to 0.22 M Gdn-HCl, with the time spent in 7.4 M recorded as the delay time. The fraction of the total amplitude as a function of the delay time is given in Figure 6. The data indicate a shift in population from the fast pathway (τ2) to the slow pathway (τ1) over the aging time. This is consistent with a proline isomerization occurring after unfolding.
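The expected outcome of the double jump experiment can be sketched with a simple kinetic model. The model below is an assumption made for illustration, not the analysis used for Figure 6: all molecules are taken to start in the fast-folding form (native proline isomers) at zero delay, and isomerization in the unfolded state is taken to be a single-exponential approach to an equilibrium fraction of slow-folding molecules. The parameters `k_iso` and `slow_fraction_eq` are hypothetical.

```python
import math


def slow_fraction(delay_s, k_iso, slow_fraction_eq):
    """Fraction of unfolded molecules in the slow-folding form after a delay.

    Assumes all molecules are fast-folding at delay 0 and relax
    exponentially (rate k_iso, in 1/s) toward slow_fraction_eq.
    """
    return slow_fraction_eq * (1.0 - math.exp(-k_iso * delay_s))


def phase_amplitudes(delay_s, k_iso, slow_fraction_eq):
    """Relative amplitudes of the fast and slow refolding phases."""
    slow = slow_fraction(delay_s, k_iso, slow_fraction_eq)
    return {"fast": 1.0 - slow, "slow": slow}


if __name__ == "__main__":
    # Hypothetical isomerization rate and equilibrium slow fraction.
    for delay in (0, 30, 100, 300, 600):
        amps = phase_amplitudes(delay, k_iso=0.01, slow_fraction_eq=0.3)
        print(f"{delay:4d} s  fast {amps['fast']:.3f}  slow {amps['slow']:.3f}")
```

In this picture the amplitude lost from the fast phase reappears in the slow phase as the delay lengthens, which is the signature the interrupted refolding experiment is designed to detect.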
The amplitude shift of refolding phases during interrupted refolding experiments is consistent with proline isomerization limiting the folding reaction. However, any spectroscopically hidden phase with slow kinetics occurring within the unfolded ensemble would give the same results. To further test whether proline isomerization limits folding, we repeated the double jump experiments in the presence of Cyclophilin A (CycPA), a protein with peptidyl-proline isomerase activity. In this experiment, we again perform the double jump from 0 M to 7.4 M to 0.22 M Gdn-HCl, with time spent in 7.4 M Gdn-HCl recorded as the aging time. Here, after a 600 s aging time, the refolding reaction is performed by dilution of the 7.4 M Gdn-HCl sfGFP solution into buffer containing CycPA, to a final concentration of 0.22 M Gdn-HCl. The presence of CycPA increases the population folding via the fast (τ2) route by 10% (Figure 6). The shifting of the phases from fast (τ2) to slow (τ1) over the aging time, and the recovery of the fast phase in the presence of Cyclophilin A, provide evidence that proline isomerization causes multiple phases in folding and unfolding. Although CycPA does not fully restore folding to the fast phase, refolding is performed at 0.22 M Gdn-HCl, which lowers the activity of CycPA by destabilization.
Interestingly, the guanidine-HCl concentrations at which no detectable kinetics in the unfolding and refolding reaction are observed (see Figure 4) corresponds to the region of difference in the unfolding and refolding curves presented in Figure 3a. In an effort to test if the inability to measure kinetics in the center of the chevron is a result of a change in mechanism or rate determining step in the unfolding and refolding reactions, we performed a series of kinetic experiments. These experiments are similar to double jump experiments with an infinite aging time. In these experiments we varied the initial conditions by equilibrating samples within the transition regions of the denaturation (D) and renaturation (R) curves (Figure 7-and Methods and Materials) prior to kinetic analysis. We then monitored the kinetics of a jump from the initial conditions within the transition zone to final conditions within the hysteresis zone (Figure 7).
If the protein follows two separate routes in unfolding and refolding, the expectation is that varying the initial conditions will allow us to probe the kinetics within the transition region. The observed rates will allow us to probe the guanidine dependence within the center of the chevron plot (Figure 4) and the endpoint of the reaction will track along the hysteresis curve backwards; i.e. the refolding transition will follow from the center of the curve, along the modeled line to a fully unfolded state within the hysteresis area. Conversely, the unfolding transition will follow from the center of the curve to the fully refolded state (Figure 7a). However, if the hysteresis is caused by trapping of the protein in a broad, flat barrier region of the folding landscape at intermediate denaturant concentrations, there will be no relaxation, as this represents a highly glassy state. The resultant fluorescence intensity will not change as we probe the center of the glassy hysteresis zone; i.e. the result will appear to be a minor hysteresis curve, as seen in magnetic systems33 (Figure 7b). The results of our studies are plotted in Figure 7c as black filled circles and squares for refolding and unfolding reactions, respectively. The data are consistent and resemble a minor hysteresis curve, with no observable kinetics (Figure 7c). This data provides evidence of a glassy transition or trapped state in the folding landscape resulting in the observed hysteretic phenomenon.
Interestingly, we observe significant shifts in the unfolding and refolding reactions between 24 and 96 hours, but no change between 96 and 192 hours (Figure 2). The extrapolated kinetic results, however, suggest that the equilibrium curves should coincide after five half-lives, or 2 months. In Figure 8, we present the results of monitoring the equilibrium reactions after 58 and 102 days. Importantly, the refolding reaction shows no change in the transition curve observed between 96 hours and 106 days. Conversely, while the unfolding transition curve shows no change between 96 and 192 hours, a shift in the transition curve to lower denaturant concentrations occurs after 58 and 102 days, similar to results seen by Sophie Jackson34; 35. Despite the shift to lower denaturant concentrations, the observed transition curve retains three-state behavior and does not approach the observed refolding curve. If folding were simply extremely slow, and insufficient equilibration times were the explanation for the observed hysteresis, the refolding curve would also have shown change over time. Our results indicate no change over time for refolding, and demonstrate that a separate process controls unfolding at intermediate guanidine concentrations.
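The argument about equilibration time can be checked with a short calculation. This sketch assumes a single-exponential approach to equilibrium with the 245 hour maximum half-life estimated from the kinetic data; that assumption is an idealization, and the function names are hypothetical.

```python
def half_lives_elapsed(elapsed_hours, half_life_hours):
    """Number of half-lives contained in the elapsed time."""
    return elapsed_hours / half_life_hours


def fraction_equilibrated(elapsed_hours, half_life_hours):
    """Fraction of the approach to equilibrium completed.

    Assumes a single-exponential relaxation, so the remaining
    deviation halves with every half-life.
    """
    return 1.0 - 0.5 ** half_lives_elapsed(elapsed_hours, half_life_hours)


if __name__ == "__main__":
    HALF_LIFE_H = 245.0  # maximum half-life estimated in the text
    for label, hours in (("96 h", 96.0), ("192 h", 192.0),
                         ("58 d", 58 * 24.0), ("102 d", 102 * 24.0)):
        n = half_lives_elapsed(hours, HALF_LIFE_H)
        done = fraction_equilibrated(hours, HALF_LIFE_H)
        print(f"{label:>6}: {n:5.2f} half-lives, {done:.1%} equilibrated")
```

Under this assumption, five half-lives correspond to about 97% completion, so curves that remain non-coincident well beyond that point are difficult to attribute to slow kinetics alone.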
Solution studies have shown that the GFP chromophore not only can undergo isomerization, but also is held in a strained configuration by the protein core that surrounds the central helix36. GFP folds into a native barrel structure prior to chromophore formation, and the α-helix backbone kinking which accompanies chromophore formation requires that the protein be folded into the barrel structure. In an effort to understand if the chromophore of sfGFP contributed to the observed hysteresis, we wished to compare the folding of the sfGFP protein which had yet to undergo chromophore formation. Previously, the R96A GFP mutant was discovered to slow chromophore formation from minutes, to months37. We mutated this residue in sfGFP and monitored the equilibrium unfolding and refolding of the protein by the change in tryptophan fluorescence as a function of added denaturant concentration (Figure 9a). The equilibrium unfolding and refolding transitions of the R96A sfGFP are superimposable, and show no evidence of hysteresis.
We have shown that, as in cycle3 GFP, proline isomerization slows the folding kinetics of sfGFP. Scanning mutagenesis in our laboratory revealed that mutation of the N-terminal residues in two X-Pro peptide sequences, Y74P75 and M88P89, to Y74M/M88Y also hindered chromophore formation. The equilibrium refolding and unfolding transitions for this variant likewise show no apparent hysteresis (Figure 9b). These data support the hypothesis that the hysteresis seen in folding is coupled to the conformation of the novel chromophore. Interestingly, the cooperativity of the folding/unfolding transitions for the double mutant is decreased relative to that observed for R96A. These two mutations map to the helical cap of the barrel, which is tightly pinned to the barrel in sfGFP (see Discussion). Destabilization of cap/barrel interactions may alter the solvent accessible surface in the double mutant protein by facilitating solvent penetration into the core.
We have shown that abolishing the chromophore from the structure of sfGFP abolishes hysteresis seen in folding. We acquired 1-D spectra of sfGFP and trapped sfGFP as described for Figure 7 (Figure 10). Upon dilution or desalting of guanidine from the 1.8M sample, fluorescence is maintained at the same level as that observed in the trapped species described in Figure 7 (data not shown).
One-dimensional 1H spectra have similar chemical shift dispersion, showing that overall structure is the same between sfGFP in both native and “trapped” states (Figure 10a). However, upon close inspection, overlaid spectra show subtle changes in both the amide and aliphatic regions of the spectra. Changes around 0 ppm (Figure 10b) and 6.5 (Figure 10c) ppm are expanded for clarity. Subtle changes also appear at 8.2, 2.3, 7.4, 7.9, and 8.8 ppm.
Superfolder GFP was generated by directed evolution and demonstrates 100% fluorescence recovery in refolding experiments, a significant improvement for molecular reporting21. Interestingly, equilibrium unfolding and refolding experiments are non-coincident (Figure 2a), and at first glance, might suggest lack of complete reversibility in folding and/or aggregation. However, experiments with sfGFP protein which has been unfolded and refolded generate the same equilibrium and kinetic unfolding curves as protein which has never been unfolded (Figure 2b). In addition, there is no evidence of protein concentration dependence, as equilibrium curves were the same over a ten-fold protein concentration.
The non-coincidence of the sfGFP unfolding and refolding transitions shows apparent hysteresis. Hysteresis is a property in which a system does not immediately respond to the forces applied to it, and may arise from a bifurcation in the energy landscape. This, in turn, leads to a bistable system where the observed equilibrium is dependent not only on the final conditions, but also the initial (historical) conditions33. Common manifestations of hysteresis are lag effects (not necessarily temporal) between an applied force and the resultant final (not necessarily equilibrium) state, bistability, as well as memory effects38; 39; 40. Hysteresis was originally described in magnetic systems where an applied magnetic field will magnetize a material; once removed, the material will retain the magnetism. This property is exploited in magnetic storage, such as magnetic tapes and hard drives.
Hysteresis has been seen in protein folding, although it is typically limited to multimeric41; 42; 43; 44 and modular repeat proteins such as seen in the unfolding/refolding of titin 45; 46, a molecule composed of 21 repeat domains. Different activation energies for folding and unfolding between domains may cause hysteresis curves if unfolding is controlled by domain-by-domain transitions while refolding is a global process. Interestingly, denaturant-induced unfolding and refolding of trimeric type III collagen also shows hysteresis in which proline cis-trans isomerization may be a factor due to a proline rich region47. A mechanism which explained the hysteresis observed in collagen folding has been proposed. Similar to titin, hysteresis is thought to be caused in type III collagen refolding due to slow annealing from loop rearrangement, compared to very cooperative denaturation; unfolding occurs on a local scale, while refolding requires a global process48. While hysteresis is rare in single domain, monomeric proteins, it has been observed in single molecule folding experiments where ensemble averaging is removed, as demonstrated in RNaseH49, and GFP.
Single molecule studies of GFP unfolding show the presence of intermediates in mechanical unfolding, attributed to the detachment of single β-strands from the native structure50. Further experiments on GFP show that when it is mechanically unfolded from linkages other than the N or C termini, there are multiple pathways of unfolding, which share similar stability51. Multiple pathways are one result of hysteresis, although the unfolding mechanisms described may differ under mechanical stress versus chemical denaturation.
A simple explanation of the apparent hysteresis is that the slow folding and unfolding kinetics of sfGFP are not conducive to reaching folding equilibrium. Although equilibrium titrations appear to be static after 96 hours (Figure 2), predicted kinetics within the center of the chevron plot (Figure 4) estimate equilibrium to be reached after two months. Importantly the refolding reaction maintains two-state behavior and shows no change in the observed transition after 3 months (Figure 8). Conversely, the unfolding transitions shift over at least 3 months and maintains three-state behavior, consistent with the presence of a native-like intermediate (N*) seen in cycle3 GFP unfolding35. If insufficient equilibration time were responsible for the observed hysteresis effect, we would have seen changes in both the unfolding and refolding transitions with time. However, the fact that only the unfolding transition continues to shift with time indicates that a different barrier controls the unfolding and refolding reactions. These barriers must also be large as recently described for the irreversibility in the assembly of the SNARE fusion machines52; 53, and in GFP35. In native-state hydrogen exchange studies fluctuations within the native state allow population of high energy species and labeling of interior residues,54; 55. Similarly, fluctuations in the native state of sfGFP under destabilizing conditions allow for the transient population of fully unfolded protein, which can only return to a lower energy state via the landscape used for the refolding reaction.
Hysteresis can arise from a bifurcation in the energy landscape. This, in conjunction with large kinetic barriers, would constitute a rough energy landscape, as described in protein folding theory. Multiple energy wells (intermediates) are observed often in protein folding, so why is hysteresis such a rare observation?
Random heteropolymers sample many configurations with little bias. The ground state of a random heteropolymer arises from competition between conflicting energy contributions, and the resulting landscape is described as frustrated. If proteins behaved like random heteropeptides, even a minor change in sequence could cause an alternate structure to become the new ground state. Thus, the energy landscape for a random sequence would be rugged and the dynamics glassy56; 57. This type of landscape would not be conducive to reliable protein folding and function. However, a rough landscape may show hysteresis, as these heteropolymers could be easily trapped in glassy transitions.
Proteins have evolved to robustly fold so that spurious single-site mutations rarely affect the global fold. In this case, interactions are not in conflict as described above for the random heteropolymer, but are supportive and cooperatively lead to a low-energy structure. Thus, stable protein folds have mutually supportive, cooperative interactions in the native structure said to be “consistent” or “minimally frustrated”. Energy decreases, on average, as sampled structures approach the native state and the energy landscape is “funneled”56; 57. This funneling predicts simple kinetics for small single-domain proteins, in agreement with experimental results58. Hysteresis is thus a rare occurrence. Therefore, does the novel global fold of GFP lead to the hysteresis observed in folding studies?
The crystal structure reveals a unique 11 stranded β-barrel surrounding a central α-helix, which contains the chromophore (Figure 11) 59. GFP must be folded into a barrel structure prior to de novo chromophore formation12. The mechanism of chromophore formation in GFP has been proposed to be facilitated by mechanical stress induced by the β-barrel scaffold in the immature native protein. Tension which concentrates force on the central α-helix is proposed to create an energetically unfavorable “kink”. Subsequent chromophore formation could reduce the strain in the α-helix by accommodating the kink with new stabilizing bonds forming the chromophore60. The central α-helix may be positioned in the central core of the protein by a hydrophobic patch of conserved amino acids Phe27, Leu60, and Leu12561. It is possible that a combination of hydrophobicity and mechanical stress is able to strain the helix enough to distort it, setting the stage for chromophore maturation through catalysis by E222 and R96.
Chromophore formation involves a novel, irreversible backbone rearrangement of the tripeptide SYG (residues 65–67) and a cyclization/dehydration and oxidation reaction, to create the para-hydroxybenzylideneimidazolinone chromophore (Figure 11, green)12. The actual order of the reactions is still under debate37; 62; 63; 64. Once the chromophore is formed, the protein matrix provides steric restraints, keeping it in its native, twisted state36; 65. In simulated chromophore compounds free of the β-barrel, though, the chromophore is planar, as expected in unfolded GFP65; 66. This non-planarity, as well as the cis-trans isomerization seen in the chromophore of native GFP, is also seen in crystal structures of kindling fluorescent proteins67. Enoki et al.16 report a “breathing” mechanism in the early folding kinetics of cycle3 GFP between the tryptophan and chromophore fluorescence that is consistent with the cis-trans proline isomerization observed in NMR studies of the native state68. The breathing mechanism is described as the tryptophan residue being buried in a native-like structure, released, and then reburied and quenched by the chromophore, possibly due to proline and chromophore isomerizations hindering folding. Our NMR results show a similar overall fold, with subtle changes (Figure 10), possibly related to isomerizations in prolyl-peptide bonds and the chromophore.
The structure of GFP has two characteristics that may lead to a rough landscape. First, the chromophore is slightly twisted in the native form, whereas it is planar in the unfolded form (see above). Second, the central helix shows many deviations from optimal geometry. Our results indicate that the sequence variant R96A has impaired chromophore formation and no hysteresis (Figure 9a). In addition, we found through scanning mutagenesis that mutation of residues N-terminal to the prolines in the central helix (Figure 11) not only abolished efficient chromophore formation, but also removed the hysteretic effect (Figure 9b). Interestingly, in X-Pro dipeptides, the X residue can alter the cis-trans ratio of the X-Pro bond in peptides and unfolded states of proteins69. Thus, these results suggest that these two prolines may also be responsible for the proline isomerization limitation observed in the kinetics of folding of GFP. Prior to this study, only the chemical steps in the maturation of the chromophore had been identified. Our results may pinpoint structural requirements on the helix capping regions for efficient chromophore formation.
Inspection of the crystal structure leads us to speculate that the observed alteration in maturation in the Y74M/M88Y sfGFP reflects a synergy between proline isomerization state at the ends of the helix and efficient chromophore formation. Taken together, our results indicate that the hysteretic effect in sfGFP is linked to the formation of the chromophore, and removal of this feature generates a less-frustrated energy landscape for this protein.
The funneled organization of the energy landscape dominates the kinetics of folding for highly evolved and/or well designed proteins 70; 71. Proteins, however, do not have one landscape for folding and a different one for function. There is some speculation that functional residues are a hindrance to folding72; 73; 74; 75. Interestingly, our recent studies on the β-trefoil protein interleukin-1β indicate that the residues within and contacting the functionally significant β-bulge may introduce “frustration” into the folding of this protein76; 77. Therefore “frustration” associated with conserving specific functional sites may alter not only the folding trajectories, but the transitions observed in native proteins. Thus it appears that the native basin of the energy landscape dictates structural changes related to function78; 79.
The results presented herein indicate that the energy landscape of GFP is particularly rough, leading to hysteresis. However, the landscape can be made smoother by hindering chromophore formation. Other post-translational modifications such as glycosylation80, deamidation81, and phosphorylation82 also appear to change the landscape and native basins of these proteins. Therefore, unlike the case observed for random heteropolymers, a specific structural feature of GFP leads to frustration in the energy landscape. More precisely, a cyclization in the polypeptide chain frustrates the landscape, as seen in simulations of small polypeptides83. The de novo folding pathway of GFP proceeds prior to chromophore maturation, while the refolding of GFP proceeds in the presence of the already cyclized chromophore. As chromophore formation requires the folding of the barrel, the de novo folding pathway should lack hysteretic behavior and have the smooth energy landscape more typical of 11-stranded β-barrel protein folds. Consistent with this hypothesis, the apparent hysteresis is ameliorated in the presence of mutations that impede chromophore formation.
Recombinant superfolder GFP (sfGFP) was subcloned into the kanamycin-resistant pET28(a+) vector and the resultant plasmid transformed into BL21 (DE3) Escherichia coli cells for expression. Mutations (R96A and Y74M/M88Y) were introduced into sfGFP using primer-directed mutagenesis. Expression and purification were performed similarly to previous methods21, with some changes. Cells were grown in Luria Broth at 30°C to an OD600 of 0.6, and protein expression was induced with 1mM IPTG for 5 hours at 25°C. Cell pellets were re-suspended in buffer (pH 7.9) containing 0.5M NaCl, 20mM Tris-HCl, 5mM imidazole, 0.1mM EDTA, 1mM DTT, 10mg/mL PMSF and sonicated on ice (Sonic Dismembrator 550, Fisher Scientific) with 30 s pulses separated by 30 s breaks for a total of 20 minutes.
The lysate was cleared by centrifugation at 14,000g for 30 minutes. The soluble sfGFP was bound to and eluted from a Ni-NTA resin (Novagen) according to the manufacturer’s protocol. sfGFP was purified further on a Sephacryl S-40 (Pharmacia Biotech) size-exclusion column in 25mM NaPO4 and 100mM NaCl at pH 6.5 to remove His-rich proline isomerase. Fractions containing sfGFP were analyzed by SDS-PAGE for purity, and the purified protein was collected, concentrated, and stored in 25mM NaPO4 and 100mM NaCl at pH 6.5. Yield was determined using a theoretical extinction coefficient (ε = 31519 M−1cm−1).
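The yield calculation above can be sketched with the Beer-Lambert law, A = ε·c·l. This is a minimal illustration rather than the procedure actually used: the text does not state the wavelength at which absorbance was read, and the function names, the 1 cm path length, and the molecular weight argument are assumptions introduced here.

```python
# Beer-Lambert estimate of protein concentration from an absorbance reading.
# Only the extinction coefficient comes from the text; everything else
# (function names, default path length, molecular weight input) is assumed.

EPSILON_SFGFP = 31519.0  # M^-1 cm^-1, theoretical value quoted in the text


def molar_concentration(absorbance, epsilon=EPSILON_SFGFP, path_cm=1.0):
    """Return concentration in mol/L from A = epsilon * c * l."""
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return absorbance / (epsilon * path_cm)


def mg_per_ml(absorbance, mol_weight_da, epsilon=EPSILON_SFGFP, path_cm=1.0):
    """Convert to mg/mL. mol_weight_da (g/mol) must be supplied by the user;
    mol/L times g/mol gives g/L, which is numerically equal to mg/mL."""
    return molar_concentration(absorbance, epsilon, path_cm) * mol_weight_da
```

For example, `mg_per_ml(a, mw)` with a measured absorbance `a` and the construct's molecular weight `mw` gives the stock concentration needed to prepare the 0.1 mg/mL samples used in the titrations.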
Chromophore fluorescence spectra were obtained using the Fluoromax-2 spectrofluorimeter (Spex Edison, NJ) with an excitation wavelength of 450 nm and the emission measured at 508 nm unless otherwise noted. Circular Dichroism data were collected on the 60-DS spectrophotometer (Aviv Instruments). Equilibrium unfolding experiments were prepared with protein samples (0.1 mg/mL) in 12mM Tris-HCl at pH 6.8 and varying Gdn-HCl concentrations ranging from 0 to 7.5M, at a temperature of 22 °C. Samples were equilibrated for 96 hours, unless noted otherwise. Refolding curves were prepared by equilibrating sfGFP in 7.6 M Gdn-HCl, then refolding by dilution to varying final Gdn-HCl concentrations and equilibrating for 96 hours. Unfolding transition curves using previously refolded sfGFP were prepared by equilibration in 6.4M Gdn-HCl, subsequent refolding to 1M Gdn-HCl overnight (20 hours, approximately 1000 half-lives), concentration of the refolded protein, and subsequent dilution to varying final Gdn-HCl concentrations with a protein concentration of 0.1 mg/mL sfGFP. The fraction folded by chromophore fluorescence was monitored at 508nm, after excitation at 450nm (3nm slit width). The fraction folded using tryptophan fluorescence was determined from the average wavelength (equation 1), after excitation at 295nm (5nm slit width).
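Equation 1 is not reproduced in this text. A minimal sketch of the quantity it refers to, assuming the commonly used intensity-weighted definition of average emission wavelength, ⟨λ⟩ = Σ Iᵢλᵢ / Σ Iᵢ, is given below, together with a simple linear normalization between native and unfolded signals. Both function names and the linear normalization are illustrative assumptions, not the authors' stated procedure.

```python
# Intensity-weighted average emission wavelength and a linear fraction-folded
# normalization. The definition is the commonly used one and is assumed here,
# since equation 1 itself is missing from the text.


def average_wavelength(wavelengths_nm, intensities):
    """Return sum(I_i * lambda_i) / sum(I_i) over an emission spectrum."""
    if len(wavelengths_nm) != len(intensities):
        raise ValueError("wavelengths and intensities must have equal length")
    total = sum(intensities)
    if total == 0:
        raise ValueError("total intensity is zero")
    return sum(w * i for w, i in zip(wavelengths_nm, intensities)) / total


def fraction_folded(signal, native_signal, unfolded_signal):
    """Map a signal onto 0 (unfolded) .. 1 (native) by linear interpolation."""
    if native_signal == unfolded_signal:
        raise ValueError("native and unfolded signals must differ")
    return (signal - unfolded_signal) / (native_signal - unfolded_signal)
```

The same `fraction_folded` helper applies to the 508 nm chromophore intensity, with the native and unfolded signals taken from the plateaus of the titration.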
Measurements of the unfolding relaxation time, τu, which depends on the final Gdn-HCl concentration, were performed by manual-mixing fluorescence spectroscopy, using a Fluoromax-2 spectrofluorimeter (Spex), with the excitation wavelength set to 450 nm and the emission intensity monitored at 508 nm. The proteins were unfolded over a range of denaturant concentrations between 5.0 and 8.0M in 12.5mM Tris-HCl at pH 6.8, 25°C. The time-dependent decay in the observed fluorescent signal was adequately fit by a double exponential equation. The dead-time for the manual mixing experiments was ~10 sec. For refolding experiments, the protein was unfolded in 8M Gdn-HCl at a protein concentration of 2.7 mg/ml and allowed to equilibrate. Refolding was performed by dilution of the unfolded protein with the appropriate amount of buffer and Gdn-HCl to obtain the desired final Gdn-HCl concentrations of 0.2–1.2M. The final protein concentration in refolding experiments was 0.1mg/ml unless stated otherwise.
The double-jump kinetics were performed by rapidly mixing a concentrated stock protein solution (33 mg/ml) with Gdn-HCl solution to a final denaturant concentration of 7.4 M using manual mixing techniques under conditions previously described for kinetic analysis. After a variable delay (aging) time, 30 μl of the protein/7.4 M Gdn-HCl solution was injected into a cuvette containing 970 μl of buffer solution, thermostated at 25°C. The solution was rapidly mixed and the refolding of the protein monitored as described above. The final guanidine concentration during refolding was 0.22 M.
Stopped-flow fluorescence studies were performed with an Applied Photophysics Pi*-180 stopped flow instrument (Applied Photophysics, London). Data were collected in logarithmic mode with an excitation wavelength of 450nm and emission collected using a 500 nm cut off filter with a dead time of 5ms. Typical refolding spectra were averaged over a minimum of five runs. Refolding experiments were initiated by a 1:11 dilution of 1.1mg/mL unfolded sfGFP (equilibrated in 7.73 M Gdn-HCl in buffer containing 12.5 mM Tris (pH=6.8)) into varying final concentrations of Gdn-HCl. The final concentration of sfGFP was held constant at 0.1 mg/mL, unless otherwise noted in the manuscript. Unfolding experiments were initiated by a 1:11 dilution of 1.1 mg/mL sfGFP with 12.5 mM Tris (pH=6.8) into varying final concentrations of Gdn-HCl and 12.5mM Tris (pH=6.8).
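The dilution arithmetic in the double-jump and stopped-flow protocols can be checked with a short sketch. The helper name is hypothetical, and the reading of "1:11 dilution" as one part stock in eleven parts total is an assumption, chosen because it reproduces the stated 0.1 mg/mL final protein concentration.

```python
# Final concentration after mixing a stock with a diluent (simple mass balance,
# assuming volumes are additive).


def diluted_concentration(stock_conc, stock_volume, diluent_volume, diluent_conc=0.0):
    """Return the concentration after mixing; volumes share one unit."""
    total = stock_volume + diluent_volume
    if total <= 0:
        raise ValueError("total volume must be positive")
    return (stock_conc * stock_volume + diluent_conc * diluent_volume) / total


# Double jump: 30 ul of 7.4 M Gdn-HCl into 970 ul of buffer.
final_gdn = diluted_concentration(7.4, 30.0, 970.0)

# Stopped flow: 1.1 mg/mL protein, 1 part stock plus 10 parts diluent.
final_protein = diluted_concentration(1.1, 1.0, 10.0)
```

Here `final_gdn` evaluates to about 0.222 M, in line with the 0.22 M quoted for the double-jump refolding step, and `final_protein` to about 0.1 mg/mL.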
Kinetic studies were performed as described in the manual mixing section of this manuscript. However, the initial conditions for the individual kinetic experiments correspond to the protein equilibrated at conditions of 4.5 M (unfolding) and 1.8M (refolding) Gdn-HCl, respectively. The initial samples were prepared as described under the stability measurements section. We chose the initial conditions to correspond to points on the unfolding and refolding equilibrium transition.
The kinetic data were fit to a series of exponentials:
Where A(t) is the optical signal at time t, A∞ is the optical signal at infinite time, Ai is the signal change associated with each phase (i), and τi is the relaxation time for each individual phase. The stopped-flow and manual-mixing traces were combined and globally fit using in-house software. The number and magnitude of the exponential phases were determined as previously described24.
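The fitting equation is not reproduced in this text. A sketch consistent with the symbol definitions given above, assuming the usual form A(t) = A∞ + Σ Ai·exp(−t/τi), is shown below. The function names are illustrative, and the half-life helper applies to a single exponential phase only.

```python
import math

# Sum-of-exponentials kinetic model, assumed from the symbol definitions in the
# text: A(t) = A_inf + sum_i A_i * exp(-t / tau_i).


def multi_exponential(t, a_inf, amplitudes, taus):
    """Evaluate the signal at time t for any number of exponential phases."""
    if len(amplitudes) != len(taus):
        raise ValueError("need one relaxation time per amplitude")
    if any(tau <= 0 for tau in taus):
        raise ValueError("relaxation times must be positive")
    return a_inf + sum(a * math.exp(-t / tau) for a, tau in zip(amplitudes, taus))


def half_life(tau):
    """Half-life of a single exponential phase: t_half = tau * ln 2."""
    return tau * math.log(2.0)
```

A double-exponential unfolding trace, as described for the manual-mixing data, corresponds to passing two amplitudes and two relaxation times.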
All equilibrium data were fit to either a two-state or a three-state model. Both models assume a linear dependence of ΔGunf on denaturant concentration (m). The two-state model is:
Where [D] is the guanidine concentration, m is the linear dependence of ΔG_N↔U on denaturant concentration, ΔG^H2O_N↔U is the free energy of unfolding at 0 M denaturant, and S_N and S_U are the signals of the native and unfolded states, respectively.
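The two-state equation itself is missing from this text. The sketch below assumes the standard linear-extrapolation form built from the symbols defined above: ΔG = ΔG^H2O − m[D], K = exp(−ΔG/RT), and S = (S_N + S_U·K)/(1 + K). The sign convention for m, the energy units (kcal/mol), and the default temperature of 22 °C are assumptions.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1 (unit choice is assumed)


def two_state_signal(d, dg_h2o, m, s_native, s_unfolded, temp_k=295.15):
    """Observed signal for a native <-> unfolded equilibrium at denaturant d (M).

    dg_h2o: free energy of unfolding at 0 M denaturant (kcal/mol)
    m:      assumed positive, so that dG decreases linearly with denaturant
    """
    dg = dg_h2o - m * d                      # linear extrapolation
    k = math.exp(-dg / (R_KCAL * temp_k))    # equilibrium constant [U]/[N]
    return (s_native + s_unfolded * k) / (1.0 + k)
```

At the midpoint, [D] = ΔG^H2O/m, the equilibrium constant is 1 and the signal is the mean of S_N and S_U, which is a quick sanity check on any fitted parameter set.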
The three-state model is similar to the two-state model:
Where K_N↔I and K_I↔U are defined as described above for the two-state model, and Z weights the signal of the intermediate relative to the native and denatured signals:
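Neither the three-state equation nor the definition of Z is reproduced in this text. The sketch below is one common formulation, offered only as an assumption: each step follows the linear extrapolation model, and Z places the intermediate signal between the native and unfolded signals as S_I = S_N + Z·(S_U − S_N). The parameter names and units are illustrative.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1 (unit choice is assumed)


def three_state_signal(d, dg_ni, m_ni, dg_iu, m_iu,
                       s_native, s_unfolded, z, temp_k=295.15):
    """Observed signal for N <-> I <-> U at denaturant concentration d (M).

    z is an assumed fractional weight: z = 0 gives a native-like intermediate
    signal, z = 1 an unfolded-like one.
    """
    rt = R_KCAL * temp_k
    k_ni = math.exp(-(dg_ni - m_ni * d) / rt)   # [I]/[N]
    k_iu = math.exp(-(dg_iu - m_iu * d) / rt)   # [U]/[I]
    s_int = s_native + z * (s_unfolded - s_native)
    pop_i = k_ni                 # population of I relative to N
    pop_u = k_ni * k_iu          # population of U relative to N
    partition = 1.0 + pop_i + pop_u
    return (s_native + s_int * pop_i + s_unfolded * pop_u) / partition
```

With this parameterization, an intermediate whose fluorescence is indistinguishable from the native state corresponds to Z near 0, which is one way a native-like trapped species could be represented in the fit.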
Data were fit by non-linear least-squares minimization using in-house scripts and the curve-fitting tool in Matlab.
All NMR experiments were performed at 315 K on a Bruker Avance800 equipped with a triple-resonance triple-axis gradient probe. Protein samples were prepared as described above, with an additional quick-spin column step to remove guanidine from the trapped species. Trapped species were prepared as described above by unfolding sfGFP in 6M guanidine-HCl for 10 minutes and then jumping to 1.8 M guanidine-HCl for 96 hours. The trapped species corresponds to a sample in the transition region of the refolding curve, shown as black boxes in Figure 7. Final conditions for each sample were 30mM phosphate buffer at pH 6.8, 100mM NaCl, and 95% H2O/5% D2O. Data were processed using Felix 2004 software (Accelrys, San Diego, CA). One-dimensional 1H spectra were acquired for sfGFP, R96A sfGFP, and trapped sfGFP (described above). Spectra were acquired with a 20 ppm spectral width using a 1-D pulse sequence with Watergate suppression84; 85.
We would like to thank the members of the laboratory for discussions of the work, including Melinda Roy for thought-provoking and insightful discussion. This work was funded by the Molecular Biophysics Training Grant (GM 08326)(BTA), LANL (GW/PAJ) and NIH grants (GM 54038 and DK 54441)(PAJ).