Fluorescent Proteins (FPs) have revolutionized cell biology. The value of labeling and visualizing proteins in living cells is evident from thousands of publications since the cloning of Green Fluorescent Protein (GFP). Biologists have been flooded with a cornucopia of FPs; however, the FP toolbox has not necessarily been optimized for cell biologists. Common FP plasmids are suboptimal for FP-fusion protein construction. More problematic are commercial and investigator-constructed FP-fusion proteins that disrupt important cellular targeting information. Even when cell biologists correctly construct FP-fusion proteins, it is rarely self-evident which FP should be used. Important FP information, such as oligomer formation or photostability, is often unsearchable or anecdotal. This brief guide is offered to assist in correctly exploiting FPs in cells.
Hundreds of reviews, books, methods chapters, and websites are devoted to FP technology, selecting FPs for their physical features, and describing an ever-expanding list of applications for FPs [1-11] (http://micro.magnet.fsu.edu/primer/techniques/fluorescence/fluorescentproteins/fluorescentproteinshome.html). GFP and other FPs have become pervasive in modern biological sciences and were recently recognized with the 2008 Nobel Prize in Chemistry. Despite the ubiquity of GFP, its use does not conform to a standardized one-size-fits-all protocol. Some companies, such as Clontech (Invitrogen), offer a User's Manual for FPs. However, much of the information provided concerns the FPs themselves, with little about FPs in the context of fusion proteins in cells, one of the most popular FP applications. To assist novices and expert users alike in following good practices and avoiding costly mistakes, I have prepared this guide for cell biology applications of FPs.
GFP and the other FPs are all inherently fluorescent proteins. Osamu Shimomura was able to purify GFP from the jellyfish Aequorea victoria and demonstrated the protein emitted bright green fluorescence when illuminated with ultraviolet light (Figure 1A). For several years, it remained unknown whether the protein required additional jellyfish factors to create the fluorescent protein or if the protein might autocatalyze the formation of the fluorophore. Doug Prasher successfully cloned the GFP gene from jellyfish in 1992 and hypothesized GFP could form a fluorescent molecule in a heterologous environment. Martin Chalfie obtained the clone from Prasher, expressed it in E. coli, and confirmed that GFP could fluoresce, even if expressed in organisms from different biological Kingdoms. It was this finding that ushered in a new era in cell biology, in which proteins of interest could be visualized with genetically encoded optical tags in live cells (Fig. 1B) or even whole animals. Work by the laboratories of Roger Tsien, Atsushi Miyawaki, Konstantin Lukyanov, and many others led to a deep understanding of the GFP structure (Figure 1C) and the mechanism of fluorescence, and resulted in enhanced GFP (EGFP, with improved fluorescence and expression properties), as well as the generation of dozens of FPs of different colors (Fig. 1D) and unusual properties such as photoactivation. Armed with a toolbox of powerful reagents and modern microscopes, biologists can now follow the spatial and temporal dynamics of cells, organelles, and individual proteins with high resolution.
When purchasing or requesting an FP plasmid, you will often be asked to select the “N” or “C” version. These terms refer to the original Clontech (Invitrogen) EGFP plasmids and indicate the position of the multicloning site relative to the FP. N constructs place your protein of interest at the NH2-terminus of an FP and C places your protein at the COOH-terminus of an FP. For ease of subcloning into other FP plasmids, nearly all FP cDNAs have been integrated into these plasmids. The N and C plasmids contain a resistance marker suitable for both bacterial selection and generation of stable mammalian cell lines (Kanamycin for bacteria and G418 for mammalian cells). Both plasmids utilize one of the strongest promoters available (CMV), which will help produce robust levels of FP or FP fusion proteins, but will probably substantially overexpress most cellular proteins. The N and C plasmids are excellent for simply expressing an untagged FP, but have some issues for construction of some fusion proteins (see below).
Currently, there are dozens of FP options available when designing an experiment. Which one to select? The answer can change every few months as improved FPs are reported in the literature, though newer does not always equal better. Whichever FPs one is considering, there are some key features fundamental for any FP experiment. Spectral and biochemical properties are important for FPs and these are usually provided either in the original paper describing the FP or a company's data sheet (see Box 1). With few exceptions, investigators need the brightest, most photostable, least phototoxic, and fastest folding FP to achieve robust fluorescent signals. For FP-fusion proteins, our lab primarily uses the FPs listed in the first part of Table 1. Superfolder GFP is showing great promise for fusion protein constructs that appear comparatively dim, most likely because the fusion proteins are interfering with FP folding (see ref , especially Figure 4). That is, just as FPs can affect the functionality of a fusion protein (see following sections), a protein of interest can disrupt FP folding and affect the FP fluorescent signal.
The majority of FPs have been developed from jellyfish and coral proteins. One major difference between these animals and mammals is the choice of amino acid codons. In 1996 Brian Seed and colleagues  improved the expression and fluorescent signal of GFP in mammalian cells by 40-120 fold by heavily modifying the GFP codon sequence to reflect mammalian cell preferences. Most commercially available plasmids have been codon optimized for mammalian cells. However, some of the older FP plasmids lurking in lab freezers may not have been optimized and one GFP may not be equivalent to another. Mammalian codon optimized FPs are not necessarily optimized for other model organisms such as Drosophila or yeast. Investigators working in nonmammalian systems should consider synthesizing codon-optimized variants of FPs (currently about $350 for the average FP) for their model organism.
Even if an FP is codon optimized, it may not necessarily be suitable for some cellular environments. For example, TagRFP and mKO contain multiple cysteines and consensus N-glycosylation sites (N-X-S/T, X is any amino acid except proline), which could modify the folding, size, and oligomerization of these FPs if targeted to the secretory pathway of eukaryotic cells. Even EGFP and its variants contain two cysteines, which can lead to disulfide-bonded oligomers in the endoplasmic reticulum (ER). Under more extreme conditions, EGFP cannot correctly fold or fluoresce in the highly oxidizing environment of the periplasmic space of gram-negative bacteria. In contrast, the cysteine-less mCherry readily folds in the same environment. Therefore, always carefully examine FP amino acid sequences for potentially environmentally sensitive sequences.
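The sequence check recommended above is easy to automate. The following is a minimal sketch, not a validated tool: it assumes a one-letter amino acid string as input, and the function name `scan_fp_sequence` and the example peptide are hypothetical. It reports only cysteines and N-X-S/T sequons (X not proline), the two features named in the text; a sequon in the primary sequence does not prove the site is actually glycosylated.

```python
import re

# N-X-S/T sequon where X is any residue except proline. The lookahead
# lets overlapping sequons (for example in "NNST") all be reported.
SEQUON = re.compile(r"(?=(N[^P][ST]))")

def scan_fp_sequence(protein: str) -> dict:
    """Report 1-based positions of cysteines and N-glycosylation sequons."""
    seq = protein.upper()
    cysteines = [i + 1 for i, aa in enumerate(seq) if aa == "C"]
    sequons = [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(seq)]
    return {"cysteines": cysteines, "sequons": sequons}

# Hypothetical peptide for illustration only; this is not a real FP sequence.
example = "MNGSCKLNPTANATC"
report = scan_fp_sequence(example)
print(report)
```

In the example, the stretch N-P-T is skipped because proline occupies the X position. Any hit should prompt a closer look at whether the FP will be exposed to the secretory pathway or another oxidizing compartment.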
Many FPs have a tendency to oligomerize either as part of the inherent structure (e.g. DsRed is an obligate tetramer) or when present in high concentrations on membranes or in oligomeric proteins (e.g. EGFP). Therefore, it is important to determine whether an FP is monomeric and whether this matters for your experiment. While FP oligomerization has become more commonly reported, the propensity of an FP to oligomerize is often unknown, as oligomerization assays are not always robust or quantitative. Many papers describe an FP as monomeric without directly demonstrating that it is a monomer or reporting a Kd value. This point is not merely academic. It can be difficult or expensive (up to $500 per plasmid) to obtain an FP plasmid and you will probably not be in a great mood if that $500 FP oligomerizes with your FRET biosensor or integral membrane fusion protein. Currently, there are no accepted standards for how monomeric an FP needs to be for cell applications. Some researchers fuse new FPs to tubulin or actin to determine whether cytoskeletal structures correctly form. However, such assays missed the effects of EGFP dimerization under other physiologic conditions (see below). Therefore, investigators must confirm that an FP-tagged protein behaves similarly to untagged proteins in assays and environments relevant to the protein of interest.
FP oligomerization matters because FPs considered monomeric have been revealed as dimers at sufficiently high concentrations in cells. For example, EGFP forms dimers when fused to integral membrane proteins or incorporated into oligomeric proteins. As a consequence, fusion FPs can form inappropriate interactions leading to false positive FRET signals or distortion of cellular organelles. For fusion proteins, the FP must be truly monomeric. Fortunately, EGFP and variants (CFP and YFP) can be monomerized with a single point mutation (A206K) [20, 21].
FPs can be expressed as free proteins either constitutively or under the control of a promoter of interest. There are few restrictions on the choice of FP for these experiments other than identifying a sufficiently bright FP. Tandem dimer FPs, e.g. tdKatushka2 and tdTomato, are excellent choices because they have two copies of an FP, making them exceptionally bright. If FPs are being used as reporters of promoter activity, then chromophore formation time may be a consideration. Fast-folding FPs, such as mCherry and Venus, will rapidly report promoter activity. Note that such FP reporters offer little insight into message stability and generally reflect both cumulative promoter activity and stability of the fluorescent protein, as fluorescent proteins typically have 24 h half-lives. To enhance FP turnover, several groups have attached proteasome degrons to FPs and achieved protein half-lives of ~2 h. Alternatively, another class of FPs, fluorescent “timers,” change color with age and provide relative measures of the ratio of recently synthesized to old FPs [24, 25].
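To see why reporter half-life matters, it can help to put numbers on the two half-lives quoted above. The sketch below assumes simple first-order (exponential) decay of a pre-existing FP pool after the promoter switches off, which is an idealization; the function name is hypothetical.

```python
def fraction_remaining(hours: float, half_life_h: float) -> float:
    """Fraction of an FP pool still present after `hours`.

    Assumes first-order decay and no new synthesis (an idealization).
    """
    return 0.5 ** (hours / half_life_h)

# Compare the ~24 h and ~2 h half-lives mentioned in the text, 6 h after shut-off.
for label, half_life in (("stable FP", 24.0), ("degron-tagged FP", 2.0)):
    print(label, round(fraction_remaining(6.0, half_life), 3))
```

Under these assumptions, about 84% of a stable FP is still present 6 h after the promoter shuts off, compared with 12.5% of a degron-tagged FP, so the stable reporter largely reflects past rather than current promoter activity.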
Visualizing a protein's distribution and dynamics in a subcellular compartment has opened new opportunities in cell biology [26-28]. Correct design and characterization of FP fusion proteins are essential for interpretation of FP fusion protein studies.
Some investigators take short cuts and “clone by phone.” While it is tempting to rely on others to create FP fusion proteins of interest, there are important reasons for making your own constructs. One must be skeptical of any constructs received from other labs or companies. Not all constructs are made with consideration of protein targeting domains (see below). Also, FPs aren't always correctly labeled. For example, a DsRed construct could be the monomeric or tetrameric form. Another example happened to me. I often perform photobleaching experiments to study the protein dynamics of GFP-fusion proteins and a fundamental requirement for these experiments is that the FP photobleaches irreversibly. Once, GFP fusion proteins from a collaborator produced unexpectedly rapid protein mobilities in cells. Sequencing revealed the GFP contained the three EGFP mutations and two additional mutations reported to enhance brightness. Control experiments revealed that this GFP, unlike standard EGFP, underwent nearly 80% reversible photobleaching (also termed photoswitching) (our unpublished results; see also [16, 29, 30]). Not all “EGFPs” are equal! Whenever obtaining an FP construct from another lab, politely request a plasmid map and a sequence file. If a sequence file is not available, sequence the FP construct yourself before performing any experiments. Don't work with mystery reagents! This anecdote also illustrates the importance of collecting stable baseline values for time-resolved fluorescence experiments to help identify phenomena such as photoswitching. Finally, unusual photophysical properties of FPs aren't always problematic. They can be exploited to develop new imaging techniques. For example, photoswitching plays an important role in the super-resolution technique of PALM (see the article by Jennifer Lippincott-Schwartz in this issue).
Whenever an epitope tag (ANY epitope tag, EGFP, myc, His, HA, etc.) is added to a protein, the tag may modify protein function either by sterically blocking protein interactions with substrates or disrupting targeting sequences (see next section). Knowledge of your protein and engineering the epitope tag to avoid disrupting protein function or targeting can circumvent such issues.
An antibody against your native protein is a key reagent for any epitope-tagging experiments. An antibody can confirm that one's tagged protein: 1) localizes correctly by immunofluorescence, 2) is the correct size and expressed at levels similar to the untagged protein in an immunoblot, and 3) interacts with known substrates in a co-immunoprecipitation. Simply tagging a protein with an FP to avoid having to make an antibody will not address all of these important points. Any FP-fusion localization or related information must be independently verified with an antibody to confirm the FP hasn't disrupted protein behavior or localization.
Besides an antibody, fusion protein studies require the availability or development of a functional assay. The importance of a functional assay cannot be overstated. Even if your tagged protein localizes correctly in a cell, it is critical to confirm your tagged protein behaves as the native protein. The point of adding an FP to a protein is to monitor the localization and dynamics of the protein of interest in cells. A nonfunctional FP-tagged protein will be uninformative at best and most likely misleading. Some examples of FP-tagged proteins with demonstrated functionality are listed in the second part of Table 1.
After selecting a bright monomeric FP, establishing a functional assay for your protein of interest, and obtaining a good antibody against your native protein, you can decide where to place the FP. Significant knowledge of the protein of interest is essential to successful FP fusion design and care should be taken to ensure that the FP fusion does not block the normal localization and functionality of the protein of interest. A critical factor in FP placement involves knowledge of the different types of protein motifs for targeting, retrieval, and retention, as well as the contextual and positional requirements of the motifs.
Many cellular proteins reside within organelles or subcompartments. Protein localization critically depends on information encoded within the protein's primary sequence. Protein targeting sequences frequently depend on the context and position of the sequence within the protein. Many protein-targeting sequences must be at the extreme NH2 or COOH terminus of the protein (see Table 2). For example, most secretory proteins will not enter the ER unless the signal sequence is positioned at the NH2 terminus of the protein. Similarly, a resident ER protein requires that the ER retrieval motif (-KDEL or -KKXX) be at the absolute COOH terminus of the protein to interact with the retrieval machinery. Thus, for example, placement of an FP before the signal sequence or after the ER retrieval motif will disrupt the correct localization of the FP-fusion protein. The positional requirements of a protein of interest's localization sequences will determine what sites are appropriate for fusing an FP.
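The positional logic in this paragraph can be written down as a simple pre-cloning check. The sketch below is illustrative only and covers just the two examples given in the text: an NH2-terminal signal sequence and a COOH-terminal -KDEL or -KKXX motif. The function name is hypothetical, and the presence of a signal sequence must be supplied by the user (for example from a predictor such as SignalP), because the sketch makes no attempt to predict it. Mitochondrial, peroxisomal, and lipid modification signals are not checked.

```python
import re

# COOH-terminal ER retrieval motifs named in the text: -KDEL and -KKXX.
ER_RETRIEVAL = re.compile(r"(KDEL|KK..)$")

def allowed_fp_positions(protein: str, has_signal_sequence: bool) -> dict:
    """Flag which termini can take a directly appended FP without masking a motif.

    `has_signal_sequence` is user-supplied; this sketch does not predict it.
    """
    seq = protein.upper()
    return {
        "n_terminal_fusion_ok": not has_signal_sequence,
        "c_terminal_fusion_ok": ER_RETRIEVAL.search(seq) is None,
    }

# Hypothetical ER luminal protein: signal sequence present, ends in KDEL.
print(allowed_fp_positions("MAAAKDEL", has_signal_sequence=True))
```

When both termini are flagged, as in the example, the FP has to be placed internally, after the signal sequence or before the retrieval motif. The -KKXX pattern match is deliberately conservative and will also flag proteins that merely end in two lysines followed by two residues, so a hit is a prompt for closer inspection, not a verdict.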
Approximately 20 percent (~6,000 genes) of the human genome encodes secretory proteins. Another 1500 proteins localize in mitochondria, up to 8400 are in the nucleus, and 60 are in peroxisomes. Cytoplasmic proteins also can contain positionally dependent posttranslational modifications, such as myristoylation and palmitoylation. Together, at least one third of the genes in the human genome encode proteins with positionally dependent information. Thus, the tagging of each protein with an FP (or ANY epitope tag) requires a specific evaluation of appropriate and inappropriate positions for the FP relative to the protein of interest. The large number of proteins with targeting information suggests all potential fusion proteins should be analyzed for targeting sequences.
It is curious that numerous publications, often in top journals, employ one-size-fits-all FP tagging strategies for following the localization and behaviors of large arrays of proteins in cells. While it is clearly attractive to develop high throughput approaches to describe the latest “-ome,” a careful reading of the FP-tagging strategy may reveal serious issues with the approach and the associated data. Many protein targeting sequences have stringent position requirements. Placing an FP before or after a targeting sequence could mask the sequence and disrupt correct targeting of the protein, which makes indiscriminate GFP-tagging of proteins a dubious practice (Box 2). For example, some studies have engineered an FP before the start or at the terminus of all open reading frames. The former approach will prevent most secretory proteins from entering the ER and mitochondrial proteins from translocating into mitochondria, and will block the addition of myristoyl groups. The latter approach will prevent retention of proteins in the ER and entry of proteins into peroxisomes. Thus, whole classes of proteins will be incorrectly targeted and incorrectly processed. The resulting data are of questionable value. Despite such concerns, some companies now offer thousands of cDNAs fused to EGFP at either the NH2 or COOH terminus. Inspection of a sample of secretory protein constructs, such as the luminal ER chaperone calreticulin, revealed that open reading frames with a signal sequence and a -KDEL retrieval motif had both EGFP fusion options, neither of which would be physiologically functional. Hardly worth $800! If you are interested in obtaining a pre-constructed FP fusion protein plasmid, one excellent resource is addgene.org. Published FP fusion constructs are available in a searchable database, have been well annotated, and are available for a modest fee of $65 per plasmid.
I do not wish to give the impression that every protein is a “mine field” of critical targeting domains. Rather, most positionally dependent targeting domains are found predominantly at the NH2 and COOH ends of the protein. This simplifies analysis and makes generation of FP fusion proteins relatively easy. Bearing in mind the importance of FP position, numerous studies have successfully created FP-tagged proteins with the functionality of the wild type untagged protein (Table 1). While FP fusion protein design (Box 2) requires significant knowledge of the protein of interest, targeting sequences are not always apparent in the primary sequence of the protein. Note that many of the sequences in Table 1 are not defined as absolute consensus sequences. This is because many targeting sequences have biochemically defined properties, but lack a common primary sequence. For example, every secretory protein in the human genome has its own unique signal sequence that ranges in size from 14 to 70 amino acids. Web-based resources, including GenBank, ExPASy, and SignalP 3.0 (http://www.cbs.dtu.dk/services/SignalP/), can assist, for example, in identifying signal sequences. Given these complexities, FP-tagging is not a recommended approach for characterizing novel or poorly studied proteins.
As most FP plasmids are in the form of the Clontech N vector, there is an additional consideration for FP fusion design. The N construct contains a strong mammalian Kozak sequence and an initiating methionine for the FP. The design is great for expressing an FP by itself, but can be suboptimal for fusion proteins, as the FP potentially could be translated independently of the attached fusion protein sequence, possibly due to leaky ribosomal scanning (our unpublished data). To reduce the potential for such phenomena, PCR amplify the FP sequence without a methionine or Kozak sequence and fuse it in frame with the cDNA for the protein of interest. Once constructed, confirm the FP fusion protein sequence, functionality, localization relative to the untagged parent protein by immunofluorescence, and fusion protein size with an immunoblot. Now, you are ready to unlock the full potential of FP fusion proteins in living cells or even whole organisms.
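Before ordering primers, the intended junction can be checked in silico. The sketch below is a minimal illustration under stated assumptions: it works on coding sequences only (the Kozak sequence lies upstream of the ATG and is therefore simply not included), the function names are hypothetical, and the toy sequences are not real genes. In practice the same result is achieved through primer design.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def strip_start_codon(fp_cds: str) -> str:
    """Return the FP coding sequence without its initiating ATG."""
    cds = fp_cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("FP coding sequence is not a whole number of codons")
    return cds[3:] if cds.startswith("ATG") else cds

def fuse_in_frame(upstream_orf: str, fp_cds: str) -> str:
    """Join the protein-of-interest ORF (5') to a Met-less FP (3') in frame."""
    orf = upstream_orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("upstream ORF is not a whole number of codons")
    if orf[-3:] in STOP_CODONS:
        orf = orf[:-3]  # drop the stop codon so translation continues into the FP
    return orf + strip_start_codon(fp_cds)

# Toy sequences for illustration only.
fusion = fuse_in_frame("ATGGCCTAA", "ATGGTGAGCTAA")
print(fusion)
```

Removing the internal ATG means the FP can only be translated as part of the full fusion, which is the point of the recommendation above. The final construct should still be sequenced.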
The pace of FP development has created the need for a centralized Internet FP resource site. FP-related web sites are available. However, the information and tools tend to be spread out over multiple Internet sites and dated. Instead, the user community needs to develop a freely accessible and searchable FP resource website that can be updated similar to a WIKI page or GenBank. Users should be able to access both the nucleotide and amino acid sequences, spectra and fluorescent properties of all FPs, and notes on use, including an FP's oligomeric state, pKa, and links to related older and newer FPs. An ideal site would also provide a widget for overlaying multiple FP spectra to aid in experimental design. Finally, a user comment section with matters arising for each FP could help alert other users of FP applications and FP caveats. Given the success of GenBank, EXPASY, and other resource websites, an FP website should be possible and would be of great utility to everyone developing and using FPs. With better organization and accessibility of FP information, the FP toolbox will be fully exploitable for all researchers.
The spectral properties of FPs determine whether an FP can be used with an investigator's particular imaging apparatus and whether other FPs can be practically used in the same cell or experiment. These properties include:
Absorbance (excitation) spectrum: the wavelengths of light needed to excite a fluorophore.
Emission spectrum: the wavelengths of light produced by the excited fluorophore. The absorbance and emission spectra can be quite broad, which will impact the imaging setup and whether an FP can be combined with other fluorophores. Therefore, it is essential to know the full spectra of the FPs and the properties of the instrument to be used. Many core facilities have this information. If filter information isn't available in a lab manual or the microscope software, it can be found on the fluorescence filters themselves. To access your fluorescence filters, refer to the user's manual and extract the fluorescence filter cubes; the information is printed on the cubes (see http://www.olympusmicro.com/primer/techniques/fluorescence/filters.html). With the instrument filter spectra, one can determine whether the correct excitation light sources and emission filters are available for FP experiments.
Brightness: the product of quantum yield and extinction coefficient, which provides a useful reference for whether an FP will be sufficiently bright for an experiment. Brightness of FPs can be compared relative to EGFP (30,000 M−1cm−1) or spectrally related FPs. The practical consequence is that studying a protein expressed at low levels (e.g. most kinases and transcription factors) requires the brightest possible FP, whereas an abundant protein (e.g. tubulin, actin, GAPDH, chaperones) may permit use of a dimmer FP with more optimal spectral characteristics.
Maturation: brightness is reported only for the completely folded protein. Therefore, in cells, the rate of maturation of an FP may be as important as FP brightness. Maturation can range from minutes to hours and is often reported in papers. However, the methods for determining and reporting maturation vary. One should determine whether the reported rate refers to immature nascent proteins that have not yet formed chromophores or mature proteins that have been denatured and then timed for reappearance of fluorescence after removal of the denaturing conditions. The latter values are primarily for in vitro studies, as FP refolding is not a general concern in cells. Another caveat concerning maturation rates is that several studies report maturation under low oxygen conditions. In cell culture, oxygen is more abundant and FP maturation rates are much faster.
Photostability: defines how long a population of FPs can be continuously excited before photobleaching or destroying the fluorophore. This value is often provided for arc lamp excitation and laser excitation. In general, select FPs with high photostability (longer halftimes) to enable prolonged imaging of cells.
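The brightness comparison described above is simple arithmetic and can be scripted when screening several candidate FPs. The sketch below is a minimal example: the function names are hypothetical, the default reference is the EGFP figure quoted above, and the quantum yield and extinction coefficient in the example belong to an invented FP, not a published one. Real values should come from the original paper or the supplier's data sheet.

```python
def brightness(quantum_yield: float, extinction_coefficient: float) -> float:
    """Brightness = quantum yield x extinction coefficient (M^-1 cm^-1)."""
    return quantum_yield * extinction_coefficient

def relative_brightness(value: float, reference: float = 30_000.0) -> float:
    """Brightness relative to a reference FP (default: the EGFP value quoted above)."""
    return value / reference

# Invented FP, values chosen only for illustration.
fp = brightness(quantum_yield=0.5, extinction_coefficient=90_000)
print(fp, relative_brightness(fp))
```

Keep in mind that this number describes the fully matured protein in vitro; maturation rate and photostability determine how much of that brightness is usable in a live cell.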
Optimal FP fusion protein design requires significant knowledge of the protein of interest. FP tagging is not generally a recommended approach for characterizing novel or poorly studied proteins. Instead, the investigator should have as much information about a protein as possible to ensure that the FP can be placed in the least perturbing location for the protein of interest. This is discussed in great detail elsewhere. Briefly, FPs can be modified and placed before or after a relevant targeting sequence using standard molecular biology techniques. For example, a resident ER luminal protein could have the FP tag engineered in between the signal sequence and the mature protein or between the mature protein and the KDEL ER retention sequence (Fig. IA). Our lab has used PCR amplification to generate FPs with a KDEL sequence at the extreme COOH terminus. The full-length protein cDNA, minus the KDEL sequence, is then placed upstream of the FP (Fig. IB). To improve accessibility of interacting proteins to targeting domains, one can add short hydrophilic linkers of small amino acids, such as 2-6 copies of alternating glycine and serine.
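The ER luminal layout described here (protein minus KDEL, then FP, then KDEL) can be sketched at the amino acid level as follows. This is an illustration under stated assumptions, not our lab's actual protocol: the function names and toy sequences are hypothetical, and placing the Gly-Ser linker between the protein and the FP is one possible choice that the text does not specify.

```python
def gs_linker(copies: int) -> str:
    """Flexible linker of `copies` alternating Gly-Ser pairs (the text suggests 2-6)."""
    if not 2 <= copies <= 6:
        raise ValueError("the text suggests 2-6 copies")
    return "GS" * copies

def er_luminal_fusion(protein: str, fp: str, linker_copies: int = 2) -> str:
    """Protein (minus KDEL) + linker + FP + KDEL.

    KDEL is moved to the end of the FP so it remains at the extreme COOH terminus.
    Linker placement between protein and FP is an assumption of this sketch.
    """
    seq = protein.upper()
    if not seq.endswith("KDEL"):
        raise ValueError("expected a protein ending in KDEL")
    return seq[:-4] + gs_linker(linker_copies) + fp.upper() + "KDEL"

# Toy sequences for illustration only.
construct = er_luminal_fusion("MAAAKDEL", "VSKG", linker_copies=2)
print(construct)
```

The key property to verify in any such design is that the retrieval motif is still the last four residues of the finished fusion and that the signal sequence is still the first element at the NH2 terminus.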
Erik Snapp is an Ellison Medical Foundation New Scholar in Aging and is supported by NIA 1R21AG032544-01 and NIDDK 2PO1DK041918-16. Erik Snapp is a member of the Albert Einstein College of Medicine Marion Bessin Liver Center.