Proteins in the Karyopherinβ family mediate the majority of macromolecular transport between the nucleus and the cytoplasm. Eleven of the 19 known human Karyopherinβs and 10 of the 14 S. cerevisiae Karyopherinβs mediate nuclear import through recognition of nuclear localization signals or NLSs in their cargos. This receptor-mediated process is essential to cellular viability as proteins are translated in the cytoplasm but many have functional roles in the nucleus. Many known Karyopherinβ-cargo interactions were discovered through studies of the individual cargos rather than the karyopherins, and this information is thus widely scattered in the literature. We consolidate information about cargos that are directly recognized by import-Karyopherinβs and review common characteristics or lack thereof among cargos of different import pathways. Knowledge of Karyopherinβ-cargo interactions is also critical for the development of nuclear import inhibitors and the understanding of their mechanisms of inhibition.
Macromolecules need to be transported selectively and efficiently between the cytoplasm and the nucleus. Nuclear-cytoplasmic transport of macromolecules is signal-mediated: nuclear localization signals (NLSs) and nuclear export signals (NESs) in proteins direct them in and out of the nucleus, respectively. Nuclear transport proteins of the Karyopherin-β (Kapβ; also known as Importins and Exportins) family recognize NLS/NES and are responsible for most nucleocytoplasmic transport of proteins in the cell [1–5]. This receptor-mediated process is essential to cellular viability as proteins are translated in the cytoplasm but many have functional roles in the nucleus. Therefore, as transporters, Kapβs are critically involved in diverse cellular processes such as gene expression, signal transduction, immune response, oncogenesis and viral propagation.
There are 19 known human Kapβs and 14 S. cerevisiae Kapβs, each functioning in a distinct nuclear import, export or bi-directional transport pathway. Kapβs share similar molecular weights (90–150 kDa) and isoelectric points (pI = 4.0–5.0), have low sequence identity (10–20%), and all contain helical HEAT repeats. Evolutionary analysis divides the Kapβ family into 15 subfamilies that are named according to their UniProt gene names (Figure 1) [6,7]. Eleven human Kapβs and ten S. cerevisiae Kapβs can mediate import of proteins into the nucleus.
Each Kapβ recognizes a unique set of proteins or RNA, thus creating multiple transport pathways across the nuclear pore complex (NPC). Kapβs also bind weakly to phenylalanine-glycine or FG-repeats in NPC proteins known as nucleoporins, thus targeting Kapβ-cargo complexes to the NPC for translocation. Directionality across the NPC is determined by a gradient of the small GTPase Ran, which regulates Kapβ-cargo interactions [2–4]. The concentration of RanGTP is ~100-fold greater in the nucleus than in the cytoplasm due to the presence of Ran’s guanine nucleotide exchange factor RCC1, which is primarily bound to histones H2A and H2B [8,9]. In the cytoplasm the majority of Ran is in the GDP-bound form due to the cytoplasmic localization of its GTPase activating protein RanGAP, the RanGTP binding protein RanBP1 and homologous RanBP1 domains known as RBDs in cytoplasmic nucleoporin Nup358 (also known as RanBP2). Kapβs bind preferentially to RanGTP in their N-terminal arches [5,11–13]; the binding affinity for RanGTP is in the low nanomolar range while the affinity for RanGDP is ~10 μM. Import-Kapβs bind cargos in the cytoplasm, where RanGTP levels are low, and release them upon binding RanGTP in the nucleus. In contrast, export-Kapβs bind RanGTP and cargo cooperatively. While the directionality of transport is determined by the distribution of RanGTP, the localization of a specific protein is predominantly determined by the presence of an NLS and/or NES that governs the balance of nuclear import versus export.
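The consequence of the affinity difference between RanGTP and RanGDP can be illustrated with a simple single-site binding isotherm. The sketch below is illustrative only: it assumes a one-to-one equilibrium in which the free Ran concentration is not depleted by binding, it uses 1 nM as a stand-in for "low nanomolar", and the 100 nM free Ran concentration is a hypothetical value chosen for the example rather than a measured one.

```python
def fraction_bound(ligand_nM, kd_nM):
    """Fraction of Kapβ occupied at equilibrium for a simple 1:1 binding model."""
    return ligand_nM / (ligand_nM + kd_nM)

# Assumed values for illustration only.
KD_RANGTP_NM = 1.0        # stand-in for "low nanomolar" (assumption)
KD_RANGDP_NM = 10_000.0   # ~10 uM, as quoted in the text
FREE_RAN_NM = 100.0       # hypothetical free Ran concentration

occupancy_gtp = fraction_bound(FREE_RAN_NM, KD_RANGTP_NM)
occupancy_gdp = fraction_bound(FREE_RAN_NM, KD_RANGDP_NM)

print(f"Occupancy with RanGTP: {occupancy_gtp:.3f}")
print(f"Occupancy with RanGDP: {occupancy_gdp:.4f}")
```

Under these assumptions a Kapβ would be almost fully occupied by RanGTP and almost unoccupied by RanGDP at the same Ran concentration, which is the selectivity that the Ran gradient exploits.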
Many transport cargos are known for the karyopherins Importin-β (also known as Karyopherin-β1 or Kapβ1; S. cerevisiae Kap95p), CRM1 and Karyopherin-β2 (Kapβ2; also known as Transportin or Trn; S. cerevisiae Kap104p), but few are known for the other Kapβs. Correspondingly, classes of NLS recognized by Importin-β, Kapβ2 and an NES recognized by CRM1 have been characterized but classes of signals recognized by the other Kapβs remain unknown [15–20]. The first characterized nuclear transport signal is the short lysine-rich classical-NLS, which was discovered in the early 1980s [15–17] and is recognized by the Importin-α•β heterodimer . More recently, common characteristics were identified for a group of significantly larger and more diverse sequences termed the PY-NLS, which is recognized by Kapβ2 . Most other import-Kapβs are currently known to recognize protein segments with very diverse sequences, making it extremely difficult to identify common characteristics that define their NLSs. Nevertheless, the large sizes and low sequence identities of Kapβs coupled with the identification of multiple cargo binding sites for several members [21,22], suggest that dozens of NLS classes have yet to be discovered.
The goal of this paper is to review Kapβ-mediated nuclear import pathways in humans and S. cerevisiae with a main focus on direct Kapβ-cargo recognition. Since the majority of known Kapβ-cargo interactions were discovered through studies of the individual cargos rather than the Kapβs, information about Kapβ-cargo recognition is widely scattered in the literature. Consolidation of such information in this paper may reveal common characteristics or lack thereof among cargos within and between import-Kapβ pathways. A short discussion of nuclear import inhibitors is also included. Kapβ-specific inhibitors should prove very useful for the biological community to determine nuclear import and export pathways of macromolecules, for in vivo validation of new cargos and for mapping potential redundancy in nuclear transport networks. Furthermore, since Kapβs such as Importin-β, Kapβ2 and CRM1 have been identified as key players in other non-transport processes, including mitosis, centrosomal duplication and nuclear envelope assembly [23–27], Kapβ inhibitors will be useful in studying these potential new functions. Knowledge of how Kapβs recognize their cargos is critical for designing inhibitors and for understanding the inhibition mechanisms of Kapβ-cargo binding and nuclear import.
Most Kapβs bind the NLS of their cargos directly. A significant exception is the Importin-β pathway, which recognizes the classical-NLS through adaptor protein Importin-α (Kap60p in yeast). Classical-NLSs bind the Importin-α•β heterodimer with high affinity (KDs in the low nanomolar range) and the trimeric Importin-α•β•cargo complex translocates through the NPC into the nucleus [28,29]. Importin-β also uses adaptor protein Snurportin 1, which binds to the m3G-cap of snRNAs, to import snRNPs [30–32]. The recognition of classical-NLS by Importin-α is discussed in detail in Marfori et al.
The monopartite classical-NLS is a short (5–7 residues) and highly basic signal. Its compact and polar nature may favor locations in surface loops where the signal is accessible to bind in an extended conformation to Importin-α. Such short and simple motifs may be very prevalent in genomes, suggesting a huge transport load and the need for a very robust Importin-α•β machinery. Although the classical-NLS is the most prevalent nuclear targeting signal, the use of adaptor protein for indirect signal recognition by Importin-β is unusual among the Kapβs. Riddick and Macara  examined the need for adaptor proteins in nuclear import. Contrary to their original hypothesis and intuition, they found that direct Importin-β•cargo transport is faster (higher steady state nuclear accumulation and larger initial import rate) than transport mediated by the Importin-α•β heterodimer . However, since either an increase or decrease of Importin-β concentration in the cell can inhibit nuclear import of a direct Importin-β cargo, the direct pathway would seem to have a narrow window for optimal activity. In contrast, changing Importin-α concentration in the cell by siRNA knockdown or microinjection resulted in proportional changes in nuclear cargo accumulation suggesting that the adaptor pathway has increased dynamic range for control of import rates and a more flexible control of cargo gradients under different cellular conditions. Even though the Importin-α•β pathway requires more energy (the cell needs to export Importin-α from the nucleus, thus requiring two GTP cycles instead of one for direct transport), this pathway is suggested to have an evolutionary advantage, in particular in its increased robustness against environmental influences .
That the fast and efficient nuclear import mediated directly by Importin-β is physiologically important is implied by the fact that numerous protein cargos are imported through direct interactions with Importin-β (Table 1). Structures of four different Importin-β•cargo complexes are available and are discussed in detail by Marfori et al. Adaptors Importin-α and Snurportin1 bind to Importin-β through their homologous Importin-β-binding or IBB domains (KDs of Importin-β binding to the Importin-α or Snurportin1 IBBs are both ~2 nM), which each contain a long basic helix [32,35]. PTHrP binds through an extended segment to the Ran-binding arch of Importin-β (KD ~2 nM) and SREBP-2 binds through its dimerized helix-loop-helix domain to the central portion of Importin-β [37,38]. These structurally diverse cargos bind to multiple distinct binding sites of Importin-β and induce different conformations of the karyopherin [32,35,37–39]. This theme of diverse cargos binding to multiple Importin-β sites is likely to be repeated for many other Importin-β cargos. The binding determinants in many Importin-β cargos have been determined (Table 1) and range from those within extended segments (SRY, HIV-1 Tat and ribosomal proteins) to single helical elements (histones and CREB) to folded domains (NF-YA, Smad3, HTLV-1 Rex, HIV-1 Rev, TRF1, Arx, c-Jun, and PP2A) (Y. Chook, unpublished observations).
Kapβ2 is a prototypical Kapβ, which binds import cargos and nucleoporins simultaneously to target cargos to the NPC [40,41]. Kap104p is the S. cerevisiae homolog of Kapβ2. Numerous cargos have been validated for Kapβ2 while three cargos have been confirmed for Kap104p (Table 2). Many of the Kapβ2/Kap104p cargos are mRNA binding proteins and transcription factors [18,44].
Unlike the classical Importin-α•β system, which recognizes the well-defined and compact classical-NLS, Kapβ2 recognizes a diverse set of sequences in its import cargos. Structural and biochemical studies have revealed several common characteristics amongst these apparently disparate Kapβ2 signals, unifying them into a new class of NLS termed PY-NLS. Even though the 15–30 residue PY-NLSs are larger and more complex than the classical-NLSs, they also bind the karyopherin in extended conformations indicative of linear, albeit long, epitopes and with affinities in the low nanomolar range [18,45,46] (Figure 2a). However, unlike the classical-NLSs, PY-NLSs are diverse in sequence and structure and cannot be sufficiently described by a traditional consensus sequence. Instead, they are described by a collection of weak but orthogonal physical rules that include requirements for intrinsic structural disorder of a large peptide segment, overall basic character, and a set of weakly conserved sequence motifs.
Diverse PY-NLS sequences are consistent with their weak consensus motifs composed of a loose N-terminal hydrophobic or basic motif and a C-terminal RX2–5PY motif. The composition of N-terminal motifs divides PY-NLSs into hydrophobic and basic subclasses (hPY- and bPY-NLSs; Figures 2b and 2c). hPY-NLSs contain four consecutive predominantly hydrophobic residues (consensus φ1-G/A/S-φ3-φ4, where φ is a hydrophobic residue) while the equivalent region in bPY-NLSs is a stretch of 4–20 amino acids that are enriched in basic residues. Structural comparison of Kapβ2 bound to the PY-NLSs of cargos hnRNP A1, hnRNP M, hnRNP D, JKTBP and TAP/NXF1 explained recognition of these chemically diverse motifs [45,46]. Structural convergence of diverse PY-NLSs only at their consensus motifs indicated that these sites are key binding epitopes, confirmed their consensus designation despite weak sequence similarity, and suggested a multipartite nature of the PY-NLS (Figure 2b). Comparison of the hnRNP A1- and hnRNP M-NLS complexes (KDs 42 nM and 10 nM, respectively) explained how Kapβ2 recognizes the chemically diverse basic and hydrophobic N-terminal motifs. The Kapβ2 surface that binds the N-terminal motifs is highly acidic with scattered hydrophobic patches that bind the hydrophobic motif of hnRNP A1. The aliphatic portions of the basic residues in hnRNP M-NLS bind these same Kapβ2 hydrophobic patches while the charged portions bind the large acidic surface of Kapβ2. The relatively large and flat NLS binding site of Kapβ2 (Figure 2a) and its mixed acidic/hydrophobic surface allow the karyopherin to accommodate diverse sequences, ranging from the hydrophobic segment in hPY-NLSs to basic groups in bPY-NLSs. Interestingly, the PY-NLS binding region of Kapβ2 remains relatively unchanged when bound to the different PY-NLSs [45,46].
The physical rules that describe the PY-NLS (structural disorder, positive charge and weak consensus motifs) are not strong filters individually, but together they are predictive and led to identification of ~100 candidate cargos of Kapβ2 by bioinformatics. These new NLS sequences are complex signals that were discovered using a collection of individually weak rules rather than just a strongly restrictive sequence motif. Thirteen of these proteins were previously known Kapβ2 cargos. Several new cargos and many PY-NLSs have also been validated and are listed in Table 2. PY-NLS sequences are available in Süel et al.
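The way individually weak rules combine into a useful filter can be sketched in a few lines of Python. This is an illustrative sketch only, not the published search procedure: the hydrophobic residue set, the 25-residue upstream window and the threshold of four basic residues are all assumptions made for the example, and the structural disorder rule is omitted because it requires an external disorder predictor. The sketch looks for the C-terminal R-X(2–5)-P-Y motif, then asks whether the upstream region contains either a four-residue hydrophobic motif with G/A/S at the second position or an enrichment of basic residues, and finally requires a net positive charge.

```python
import re

# Assumed definition of a hydrophobic residue; the text does not fix this set.
HYDROPHOBIC = "AVLIMFWY"

# C-terminal motif: R, then 2-5 residues of any type, then PY.
C_TERMINAL = re.compile(r"R.{2,5}PY")
# N-terminal hydrophobic motif: hydrophobic, G/A/S, hydrophobic, hydrophobic.
H_MOTIF = re.compile(f"[{HYDROPHOBIC}][GAS][{HYDROPHOBIC}]{{2}}")


def net_charge(segment):
    """Crude net charge: +1 per K/R, -1 per D/E (histidine ignored)."""
    basic = sum(segment.count(res) for res in "KR")
    acidic = sum(segment.count(res) for res in "DE")
    return basic - acidic


def scan_py_nls(seq, upstream=25, min_basic=4):
    """Return candidate hits as (start, end, subclass) tuples.

    start and end are 0-based, end-exclusive positions of the
    R-X(2-5)-P-Y match; subclass is "hPY" or "bPY".
    """
    hits = []
    for match in C_TERMINAL.finditer(seq):
        window = seq[max(0, match.start() - upstream):match.start()]
        if H_MOTIF.search(window):
            subclass = "hPY"
        elif sum(window.count(res) for res in "KR") >= min_basic:
            subclass = "bPY"
        else:
            continue
        # Overall basic character: keep only net-positive candidates.
        if net_charge(window + match.group()) > 0:
            hits.append((match.start(), match.end(), subclass))
    return hits


# Synthetic toy sequences, not real cargo sequences.
print(scan_py_nls("DSFGLMKGGNFGGRSSGPYGG"))
print(scan_py_nls("EKRKGRRKSAARPSNPY"))
```

Each rule alone would accept many non-functional sequences; requiring all of them together is what narrows the candidate list, which mirrors the reasoning in the paragraph above.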
Two yeast Kap104p cargos were identified prior to the characterization of PY-NLSs in the mammalian system . mRNA processing proteins Nab2p and Hrp1p bind Kap104p through arginine- and glycine-rich signals known as the rg-NLS [47,48]. More recently, Y. Chook and colleagues showed that the rg-NLSs in Nab2p and Hrp1p shared the same characteristics as the mammalian PY-NLS, demonstrating that the PY-NLS is recognized by the two evolutionarily distant karyopherin homologs [43,49]. PY-NLSs in both Nab2p and Hrp1p contain basic N-terminal motifs. The signal in Hrp1 has a typical C-terminal RX2–5PY motif while Nab2p has a homologous C-terminal RX2–5PL motif . Interestingly, while human Kapβ2 recognizes both the hydrophobic and basic subclasses of PY-NLS, yeast Kap104p recognizes only basic PY-NLSs . Structural analyses of Kapβ2-NLS complexes and sequence analysis of the karyopherins explained the origin of this specificity [18,43,45,46]. Many NLS-binding Kapβ2 residues that are different in yeast and human contact the hydrophobic motif in hnRNP A1 whereas most Kapβ2 residues that contact basic sidechains in bPY-NLS remain unchanged. Similar analyses to track residues necessary for hPY-NLS recognition in various Kapβ2 homologs predict that metazoan but not fungal Kapβ2 should recognize hPY-NLSs .
Mutagenesis and thermodynamic analysis of Kap104p-NLS interactions revealed several physical properties that govern PY-NLS binding affinity. 1) The PY-NLS is a modular signal composed of three spatially distinct but structurally conserved linear epitopes that can be represented by a series of sequence patterns (Figure 2b). 2) Each linear epitope can accommodate substantial sequence diversity but the sequence limits for each are beginning to be defined. For example, although tyrosine appears to be rather conserved in the C-terminal RX2–5PY motif, it can often be replaced with other amino acids, especially hydrophobic ones. Import cargos Nab2p and HuR have PL and PG motifs, respectively, while Kapβ2 binding partner ELYS, which functions in NPC assembly, may contain a PV dipeptide in its NLS and ciliary transport cargo KIF17 appears to contain a PL dipeptide in its ciliary localization signal (CLS) that binds Kapβ2. 3) All three linear epitopes of the PY-NLS are structurally and energetically modular, suggesting that the daunting search for the very diverse signal sequences may be performed in parts. 4) Finally, each linear epitope can contribute very differently to total binding energy in different PY-NLSs, explaining how signal diversity can be achieved through combinatorial mixing of energetically weak and strong motifs while maintaining affinity appropriate for nuclear import function. This collection of physical properties describes how functional determinants of PY-NLSs are organized and lays a path to decode this diverse and evolvable signal for future genome-wide identification of Kapβ2 import cargos.
All structurally characterized PY-NLSs bind at the same location on Kapβ2, the inside or concave surface of its C-terminal arch (Figure 2a) [18,45,46]. Based on the shared characteristics of their binding determinants, all PY-NLSs are expected to bind at the same location. Is the PY-NLS binding site the only cargo binding site in Kapβ2? Does Kapβ2 recognize other linear NLSs that are distinct from the PY-NLS? Does it recognize conformational NLSs where the entire NLS is a folded domain? Several proteins that are imported by multiple Kapβs including Kapβ2 do not seem to contain PY-NLSs (examples include ribosomal proteins, histones, c-Fos, HIV-Rev and others; Table 3). Some bind sites distinct from that for PY-NLS, suggesting that Kapβ2 may be as versatile as Importin-β in recognizing multiple classes of NLSs. Structures of these complexes will be necessary to reveal additional cargo binding sites of Kapβ2.
Importin-4 (also known as RanBP4) is a relatively uncharacterized Kapβ. So far, only four cargos, vitamin D receptor (VDR), transition protein 2 (TP2), HIF1-α and ribosomal protein rPS3a, have been reported (Table 4). Like many nuclear proteins, all four Importin-4 cargos are basic proteins. The Importin-4•VDR interaction may involve the cargo’s N-terminal helical and dimeric DNA binding domain whereas the Importin-4•TP2 interaction involves a short basic NLS (87GKVSKRKAV95) [51,52]. Regions of HIF1-α and rPS3a that bind Importin-4 have not been determined and both cargos are also imported by other Kapβs [53,54].
Importin-4 is most similar to yeast Kap123p [6,7], which is the major importer for many ribosomal proteins and histones H3 and H4 (Table 5) [55–57]. Kap123p and Kap121p show significant functional redundancy. Kap123p is a secondary import factor for many Kap121p cargos and similarly many cargos that are primarily imported by Kap123p use Kap121p as a backup importer. Therefore, it is not surprising that Kap-binding regions of both sets of cargos share similar characteristics. The Kap121p/Kap123p recognition motifs are enriched in basic amino acids especially lysines but are longer (> 25 amino acids), more complex and more diverse compared to the lysine-rich classical-NLSs . Despite numerous mapped NLSs, no clear consensus sequence is available due to the diversity of NLS sequences and lengths. Mutagenic data and an arginine-rich NLS in cargo Nop1p suggested that basic rather than lysine residues are important determinants for Kap121p binding [57–61]. Secondary structure and disorder predictions [62,63] suggest that most Kap123p/Kap121p NLSs are either structurally disordered or contain single helical elements, suggesting that they are linear or quasi-linear signals (Y. Chook, unpublished data). Despite overlap with the Kap123p pathway, the current repertoire of Kap121p cargos is larger and includes several cargos that appear mostly Kap121p-specific (Table 5). Interestingly, Kap121p is an essential protein whereas Kap123p is not. A study of nuclear import rates in single yeast cells led to the proposal that Kap123p rapidly imports abundant cargos, while Kap121p specializes in more regulated cargos .
Importin-5 (also known as Kapβ3 or RanBP5) is the Kap121p homolog in humans. It complements a Kap121p and Kap123p double mutant, thus suggesting functional similarity to the yeast Kapβs. Like its yeast counterparts, Importin-5 also imports ribosomal proteins and core histones [50,66–68]. Approximately 20 different import cargos have been identified for Importin-5 but most of these cargos are also imported by several other Kapβs (Table 4). Cargos that are potentially unique to Importin-5 are transcription regulator p60TRP, recombinase Rag-2, apolipoprotein A-I, PGC7/Stella and the PB1-PA dimer of the influenza A RNA polymerase [69–74]. Nuclear localization studies of the first three cargos in cells have not been done even though they bind Importin-5. In contrast, knockdown or inactivation studies suggest that Importin-5 is likely to be the primary importer for PGC7/Stella and influenza PB1•PA. Eight cargos have been mapped for their Importin-5 binding sites (Table 4). With the exception of the helical interaction site of apolipoprotein A-I, most NLSs map to highly basic (lysine-rich) unstructured regions (Y. Chook, unpublished observations).
Importin-7 can mediate nuclear import either as a monomer or as an Importin-β•Importin-7 heterodimer (Table 6). Many Importin-7 cargos are also imported by other Kapβs. Importin-7 and several other Kaps (Importin-5, -9, -β and Kapβ2) share the role of importing abundant ribosomal proteins and histones [50,54,67,68]. Other shared cellular cargos include the p35 neuronal CDK5 activator, HIF1-α, c-Jun and the glucocorticoid receptor [53,76–78]. Importin-7 and/or the Importin-β•Importin-7 heterodimer also participate in importing viral components into the cell nucleus. The HIV-1 Rev protein and the adenovirus core protein pVII are imported by multiple Kapβs [79,80]. Several groups have reported nuclear import of the HIV-1 integrase (HIV IN) by Importin-7, Importin-β•Importin-7 heterodimer, Importin-β, Trn-SR2 or Importin-α•β [81–87] but it remains controversial as to which of these karyopherins or if all of them are responsible for nuclear entry of the integrase [88–90].
The Importin-β•Importin-7 heterodimer is responsible for targeting the linker histone H1 to the nucleus [54,91]. H1 binds several Kapβs but in vitro nuclear import is achieved only in the presence of both Importin-β and Importin-7. Thermodynamic analysis of H1 import by Ficner and colleagues identified H1 binding sites on both karyopherins and revealed positive cooperativity in binding H1, suggesting specific recognition by the heterodimer. When binding or import of a cargo is enhanced in the presence of both Importin-β and Importin-7, transport is likely to be mediated by the heterodimer rather than the individual karyopherins. Conversely, lack of enhancement by Importin-β would suggest that import is mediated by Importin-7 or Importin-β alone. The proline-rich homeodomain (PRH or Hex), zinc finger protein EZI and ERK-2 kinase all seem to be primarily imported by Importin-7 alone (Table 6).
Binding regions of several cargos for either Importin-7 or the Importin-β•Importin-7 heterodimer have been mapped (Table 6). Nuclear import of H1 seems to involve the entire histone protein, suggesting recognition of conformational epitope(s). Similarly, involvement of conformational epitopes is evident as Importin-7 binds the PRH homeodomain (PDB 1WQI), several zinc fingers of EZI, the PAS domain of HIF1-α and the leucine-zipper domain of c-Jun [77,94] (Table 6). Importin-7 may also recognize linear recognition epitopes. The NLS in ERK-2 is a 19-residue segment in the SPS or kinase insert domain that includes phosphorylated serines. Phosphorylation may release this surface insert from the kinase domain for Importin-7 binding. The NLS in ribosomal protein rPL23a is quite basic and likely to be flexible. Importin-7 seems to bind recognition motifs that are structurally very diverse.
Importin-7 and Importin-8 are quite similar (68% sequence identity) compared to the low similarity (10–20% sequence identity) seen in other karyopherin pairs. Surprisingly, they are functionally redundant in only a few cases, perhaps because Importin-8 is a relatively unexplored Kapβ. Importin-7 and Importin-8 can both import Smad3 and Smad4 [95,96]. Smad3 is primarily imported by Importin-7 and Smad4 is primarily imported by Importin-8. Several potential cargos were found when Importin-8 was identified as a binding partner in pull down assays. Importin-8 appears to import Argonaute proteins and SRP19 whereas nuclear import of binding partner leukemia fusion protein NPM-ALK has not been studied. NLSs for Importin-8 have not been determined [97–99].
Yeast Kap108p (Sxm1p) and Kap119p (Nmd5p) are most similar to Importin-7 and Importin-8 (Table 7). Kap108p has a minor role in importing ribosomal proteins, behind Kap123p and Kap121p (section 4 and Table 5) [100,101]. It is also a major import pathway for tRNA maturation factor Lhp1p and poly(A)-binding protein Pab1p [101–103]. NLSs in both these cargos map to regions enriched in basic amino acids that are predicted to be structured (Y. Chook, unpublished observation).
Kap119p has six known cargos: three transcription factors, one kinase, a heat shock protein and a ribosome maturation factor (Table 7 and references within). Basic amino acids in the flexible Crz1p NLS are important for Kap119p binding whereas a hydrophobic segment in the folded N-terminal region of Ssa4p is important. The NLS of Gal4p is located within its N-terminal DNA binding and leucine-zipper domains [104,105]. The Kap119p recognition motifs in its cargos show striking sequence diversity and thus a lack of consensus sequence, as seen in many other Kapβ systems.
Importin-9 is most similar to yeast Kap114p [6,7] but distinct from RanBP9, which is a non-karyopherin Ran associated protein also known as RanBPM. Currently known Importin-9 cargos include core histones, numerous ribosomal proteins and several unrelated proteins such as the A subunit of protein phosphatase 2A (PP2A), c-Jun, aristaless (Arx) and the hepatocellular carcinoma-associated protein (Table 8 and references within). NLSs in Arx, PP2A, c-Jun and rPS7 have been mapped but all appear very different except for their common positively charged character [54,77,106,107]. These targeting segments range from folded homeo-, HEAT repeat and leucine-zipper domains of Arx, PP2A and c-Jun, respectively, to the very basic and flexible 23-residue segment in rPS7 (Y. Chook, unpublished observations).
Like Importin-9, Kap114p mediates nuclear import of core histones H2A and H2B [108,109]. Kap114p is also the primary importer for histone chaperone Nap1p and transcription factors TBP (TATA-binding protein) and TFIIB (Table 8) [110–114]. In addition, Kap114p plays a secondary role, along with other Kapβs, in import of alcohol-responsive Ring/PHD finger protein Asr1p and ribosome maturation factor Rpf1p (Tables 7, 8 and 9) [115,116]. Similar to Importin-9, Kap114p also appears to recognize divergent targeting elements. NLSs in the histones and Asr1p may be conserved quasi-linear helical elements whereas those in Nap1p, TBP and TFIIB are conformational or surface patches on the α/β domain of Nap1p, the saddle-shaped α/β TBP and the C-terminal helical region of TFIIB (Y. Chook, unpublished observations).
Few cargos have been identified for Importin-11, its yeast homolog Kap120p and another yeast karyopherin Kap122p (Table 9). Importin-11 is responsible for nuclear import of Ubiquitin (Ub)-charged Class III E2 Ub conjugating enzymes UbcM2, UbcH6 and UBE2E2. However, the inability of Importin-11 to bind these activated E2s directly raises the possibility that the enzymes may be transported as larger multiprotein complexes and thus the identities of the E2 NLSs remain a mystery. Other than the E2s, Importin-11 also imports ribosomal protein rpL12. Unlike many ribosomal proteins, which use multiple Kapβs for entry into the nucleus, rpL12 is primarily imported by Importin-11 as UbcM2 inhibits its nuclear accumulation. rpL12 is a very basic protein but its binding site for Importin-11 has not been mapped. The most recently identified Importin-11 cargo is the Gag polyprotein of Rous Sarcoma Virus (RSV). At least two Kapβs are responsible for Gag nuclear import: 1) Importin-11 through direct binding to its matrix or MA domain and 2) Importin-α/β binding to the nucleocapsid or NC domain. The yeast homolog of Importin-11, Kap120p, also imports Gag, suggesting functional conservation of Importin-11/Kap120p in the divergent eukaryotes. The NLS that binds Importin-11 is located in the N-terminal half of the MA domain, which is a compact 5-helix domain with a large positively charged surface patch [122,124]. Therefore, at least one Importin-11 cargo is recognized through a three-dimensional conformational NLS.
Four yeast cargos are known for Importin-11 homolog Kap120p (Table 9) . First, Kap120p binds and imports the Ho endonuclease through a 13-residue basic NLS in its C-terminal region that is distinct from a second NLS, which is recognized by Kap121p and Kap123p . Another Kap120p cargo Rpf1p is mislocalized only when all three Kap120p, Kap119p and Kap114p are mutated . Kap binding regions in Rpf1p have not been determined but since it is a very basic protein, positive charges may be important for recognition by all three karyopherins. The third Kap120p cargo is the elongator subunit Tot1p, which also carries a basic stretch that is important for binding the karyopherin . Lastly, transcription regulator Swi6p contains a Kap120p recognition site that resides within a mixed hydrophobic-basic extended portion of its ankyrin-associated domain [127,128]. Importin-11/Kap120p appears to recognize diverse motifs that range from a short linear epitope in Swi6p to the small helical domain in Gag.
Kap122p (or Pdr6p) has no striking homology to any of the known human import-Kapβs but belongs to the very diverse TNPO3 subfamily (Figure 1) [6,7]. Its two known cargos are the Toa1p•Toa2p transcription factor complex and the Rnr2•Rnr4 ribonucleotide reductase small subunit complex (Table 9) [129,130]. The latter complex contains the WD40 repeat protein Wtm1p, which binds the karyopherin directly. NLSs in the Toa proteins and Wtm1p have not been mapped.
The majority of known cargos for Transportin-SR (Trn-SR) and its splice variant Transportin-SR2 (Trn-SR2; also known as Transportin-3 or Trn-3 or TNPO3) are splicing regulators (Table 10). Many of these splicing proteins are SR proteins, which contain one or two RNA binding domains (RBDs) and a C-terminal RS domain of at least 50 residues that is composed of many arginine-serine dipeptide repeats (RS content >40%). Trn-SRs bind human SR proteins ASF/SF2, SC35, TRA2-alpha, TRA2-beta and Drosophila splicing factors 9G8, Rbp1 and RSF1 through their RS domains [131–135]. Interestingly, Trn-SR and Trn-SR2 seem to differentially recognize and import phosphorylated and unphosphorylated SR proteins. X. Fu and colleagues showed that both Trn-SRs bind and import only phosphorylated ASF/SF2 and Trn-SR2 also imports TRA2-beta in a phosphorylation-dependent manner. In contrast, neither of the Trn-SRs distinguishes between phosphorylated and unphosphorylated TRA2-alpha, and Trn-SR also imports TRA2-beta regardless of its phosphorylation state. This work suggests that phosphorylation is not always required for the recognition of RS domains by the Trn-SRs.
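The working definition of an RS domain given above (at least 50 residues, RS content >40%) can be expressed as a small check. The sketch below is illustrative and rests on an assumption: "RS content" is interpreted here as the fraction of residues that fall within an RS or SR dipeptide, which is only one of several plausible readings. The function names are hypothetical.

```python
def rs_dipeptide_fraction(segment):
    """Fraction of residues in `segment` that lie within an RS or SR dipeptide."""
    if not segment:
        return 0.0
    covered = [False] * len(segment)
    for i in range(len(segment) - 1):
        if segment[i:i + 2] in ("RS", "SR"):
            covered[i] = True
            covered[i + 1] = True
    return sum(covered) / len(segment)


def looks_like_rs_domain(segment, min_length=50, min_fraction=0.40):
    """Apply the length and RS-content thresholds quoted in the text."""
    return (len(segment) >= min_length
            and rs_dipeptide_fraction(segment) > min_fraction)


# Synthetic examples, not real SR protein sequences.
print(looks_like_rs_domain("RS" * 30))      # long and RS-rich
print(looks_like_rs_domain("RS" * 10))      # RS-rich but too short
print(looks_like_rs_domain("RSGGGG" * 10))  # long but RS-poor
```

Such a filter says nothing about phosphorylation state or karyopherin binding; it only flags segments with the compositional features used to define RS domains.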
It is unclear if RS domains adopt three-dimensional structures on their own. The crystal structure of ASF/SF2 bound to SRPK1 kinase showed that the first four RS dipeptides of ASF/SF2 are in an extended conformation when bound to the kinase . This finding is consistent with the prediction that RS domains are structurally disordered . In contrast, molecular dynamics simulations predicted that unphosphorylated RS repeats are helical whereas the phosphorylated repeats vary from a helical to a wavy extended conformation to a compact combined helical-strand structure . Its low complexity content suggests that the RS domain or repeats may be the third class of linear NLS, behind the classical- and PY-NLSs although it is not known if all RS domains can function as NLSs. Given the contrasting suggestions for RS conformations, structures of phosphorylated and unphosphorylated RS repeats bound to Trn-SR and Trn-SR2 will be important to understand physical characteristics of these NLSs that allow them to be recognized for nuclear import.
The Trn-SR homolog in S. cerevisiae, Kap111p (also known as Mtr10p), also recognizes and imports RS-domain proteins (Table 10). RNA binding proteins Gbp2p and Hrb1p bind Kap111p through their N-terminal RS-rich domains [139,140]. Poly(A)RNA binding protein Npl3p interacts with Kap111p through its C-terminal RGG-rich region, which also contains several serine-arginine dipeptides [141,142].
In addition to RS domain-containing SR proteins, Trn-SR2 also imports the non-RS-containing splicing regulator RBM4 via its C-terminal alanine-rich or CAD domain (Table 10). RBM4 competes with RS domains for nuclear import by Trn-SR2, suggesting that the binding sites of the two motifs overlap. Comparative structural studies of Trn-SR2 bound to RS and CAD domains will be necessary to understand how the karyopherin recognizes such diverse motifs.
An even more diverse Trn-SR2 cargo is the HIV-1 preintegration complex (PIC) (Table 10). Trn-SR2 was identified in an siRNA screen as a host protein that is required for HIV-1 infection. It functions after reverse transcription but before integration, suggesting that it may mediate nuclear import of the PIC. Yeast two-hybrid experiments, siRNA studies and binding assays with recombinant proteins also showed that Trn-SR2 binds the PIC component HIV integrase (IN) and imports the PIC into the nucleus. Trn-SR2 seems to bind the integrase through its 160-residue α/β core catalytic domain, suggesting that the NLS here is a conformational one. Interestingly, Importin-7 or the Importinβ•Importin-7 heterodimer was also reported to import HIV IN into the nucleus [81,82,88–90] (see section 5).
Yeast Kap111p also mirrors its human homolog in recognizing diverse cargos. In addition to RS domains, Kap111p mediates retrograde tRNA import in yeast (Table 10). It is not known if Kap111p binds tRNAs directly or uses an adaptor in this import pathway.
Importin-13 and Exportin-4 are bidirectional nuclear transport factors. Although Importin-13 is primarily an importer with more than 10 known import cargos (Table 11), it also mediates nuclear export of translation initiation factor eIF1A. On the other hand, Exportin-4 was first identified as an export-Kapβ that mediates nuclear export of translation initiation factor eIF-5A and transcriptional regulator Smad3. More recently, Exportin-4 was found to also mediate the nuclear import of Sox family transcription factors Sox-2 and SRY.
Importin-13 recognition sites in several cargos have been determined and they all seem to be folded domains. Pax6 (PDB 2CUE) and Arx both bind through their helical homeodomains [106,147]. Heterodimerization of histone-fold pairs NF-YB•NF-YC (transcription factors), CHRAC-15•CHRAC-17 (component of chromatin accessibility complex), CHRAC-12•CHRAC-17 (component of DNA polymerase ε) and NC2α•NC2β (transcriptional regulator) is necessary for Importin-13 recognition [148–150]. Almost the entire compact heterodimer of two α/β proteins Mago•Y14 is engulfed inside the Importin-13 ring (Figure 3). It is striking that so many Importin-13 cargos bind through their folded domains. The repertoire of these conformational NLSs ranges from small 3-helical bundle homeodomains to larger heterodimers of helical histone-fold domains to heterodimers of α/β domains.
Exportin-4 binds both transcription factors Sox-2 and SRY through their 3-helix HMG-box domains. Nuclear import of Sox-2 is also mediated by Importin-9 and the Importinβ•Importin-7 heterodimer, which also recognize binding determinants in the HMG-box domain. Sox-2 is also imported via the classical Importin-α•β pathway.
Nuclear import inhibitors have been reported for the Kapβ2, Importin-β and Importin-5 pathways. Cansizoglu et al. discovered asymmetric binding hotspots in the N-terminal hydrophobic motif of the hPY-NLS of hnRNP A1 and the C-terminal PY motif of the bPY-NLS of hnRNP M. They made use of the multivalent nature of the PY-NLS and the avidity effect, and combined hotspots from the two different NLSs to design a Kapβ2-specific inhibitor. The resulting chimeric peptide named M9M is a ‘super-NLS’ that binds Kapβ2 so tightly (KD ~ 100 pM) that it competes successfully with natural PY-NLSs and RanGTP (KDs with PY-NLSs and Ran in the low nanomolar range). M9M is useful for inactivating the Kapβ2 pathway in cells to validate Kapβ2 cargos and potentially to study non-import functions of Kapβ2 such as in nuclear envelope assembly, mitosis and intraciliary transport. The M9M inhibitor works because it outcompetes natural ligands and Ran cannot displace it. This inhibition mechanism lends support to the concept that a Kapβ-NLS interaction should occur within a range of affinity suitable for both binding and release. M9M has provided a high affinity limit of ~ 100 pM for the Kapβ2 range.
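The effect of this affinity gap can be illustrated with a simple equilibrium calculation. The sketch below assumes a textbook single-site competitive binding model in which both peptides are in excess over Kapβ2, so that free and total ligand concentrations are approximately equal; it does not model displacement by RanGTP, which binds a different site. Only the ~100 pM KD for M9M comes from the text. The 10 nM KD for a natural PY-NLS and the 100 nM concentrations are hypothetical placeholders for "low nanomolar" binding.

```python
def fractional_occupancy(conc_nM: dict, kd_nM: dict) -> dict:
    """Equilibrium share of a receptor held by each competing ligand.

    Single-site competition: occupancy_i = (L_i/KD_i) / (1 + sum_j L_j/KD_j).
    Assumes free ligand concentration ~ total concentration.
    """
    ratios = {name: conc_nM[name] / kd_nM[name] for name in conc_nM}
    denom = 1.0 + sum(ratios.values())
    occupancy = {name: ratio / denom for name, ratio in ratios.items()}
    occupancy["free"] = 1.0 / denom
    return occupancy


# M9M KD of 0.1 nM (100 pM) is from the text; all other numbers are illustrative.
occ = fractional_occupancy(
    conc_nM={"M9M": 100.0, "PY-NLS": 100.0},
    kd_nM={"M9M": 0.1, "PY-NLS": 10.0},
)
for name, share in occ.items():
    print(f"{name}: {share:.4f}")
```

Under these assumptions, a 100-fold difference in KD at equal concentrations translates directly into a 100-fold difference in occupancy, which is why a sub-nanomolar peptide can act as an inhibitor rather than as a cargo.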
The concept of optimizing NLS affinity also led to the development of inhibitors for the classical Importin-α•β system. Through systematic mutational analysis of a bipartite classical-NLS template, Kosugi et al. generated an activity profile where each residue is scored for its contribution to the peptide’s nuclear import activity. Two of their peptides named Bimax1 and Bimax2, designed to have the highest scoring residues in each position, bound Importin-α so tightly (KDs reported in the femtomolar range) that they functioned as inhibitors of the classical import pathway.
Small molecule inhibitors against the Importin-β pathway were recently reported. A pyrrole compound named Karyostatin1A binds Importin-β and specifically inhibits the classical import pathway at low micromolar concentrations. Small molecule peptidomimetic compounds were developed to inhibit the Importin-α•β pathway, but they were much less potent, with IC50 values in the 100 μM range.
Nature, via viruses, has also evolved molecules that could block nuclear import. The L1 major capsid proteins of human papillomavirus (HPV) types 11 and 16 have been reported to inhibit the Kapβ2 and Importin-5 pathways. The NS5A protein of HCV and the HPV-16 E5 protein may also inhibit the latter pathway [65,156]. The HPV L1 proteins bind Kapβ2 and Importin-5 with sufficient affinity that the M9 (PY-NLS) of hnRNP A1 or RanGTP could not compete, and Kapβ2- and Importin-5-mediated nuclear accumulation was inhibited in import assays. Surprisingly, the L1 proteins are import cargos of the Importin-α•β pathway. Binding sites for Kapβ2, Importin-5 and Importin-α have not been determined, but classical-NLSs are likely to be present at the C-termini of the L1 proteins. L1 proteins have no obvious PY-NLSs and it is unclear where the Kapβ2 and Importin-5 binding determinants are located in these highly structured oligomeric proteins [157,158].
It is unclear why the cell needs 11 different nuclear importers. Are their collections of cargos random except for common NLS(s)? Or do individual Kapβs transport cargos with distinct cellular functions and thus could the Kapβs serve as important points of regulation? If there are functional programs for nuclear trafficking, what are they? Comprehensive knowledge of cargos that are imported by each Kapβ will be necessary to address these questions.
Proteomics studies have shown that 40% of all yeast proteins enter the nucleus. By analogy, the number of human proteins that need to enter the nucleus may exceed 10,000. The Importin-α•β system may import half of this traffic. Some of the remaining proteins may be small enough to diffuse into the nucleus and/or enter the nucleus in a piggy-back fashion within multi-protein complexes mediated through the NLS of a heterologous protein. There may also be significant import pathway overlap and redundancy. All this implies that each of the 10 import-Kapβs may recognize up to hundreds of different proteins, but the current Kapβ-repertoire illustrated in Tables 1–11 clearly falls well short of this predicted target. Many more cargos need to be identified for the Kapβs. Some of the cargos in Tables 1–11 were identified through pull-down experiments with Kapβs but many were identified through studies of individual cargos. The former experiments were all performed in the mid-1990s to early 2000s and the latter way of identifying Kapβ cargos is necessarily uncoordinated, slow and inefficient. Proteomics studies for Kapβ-cargo discovery may need to be revisited given improved proteomics approaches and protein identification technology. An alternative approach is in silico cargo discovery through searching for NLSs in genomes. This approach will require characterization of NLS classes for all Kapβs.
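The back-of-the-envelope reasoning behind "up to hundreds" of cargos per import-Kapβ can be made explicit. The sketch below uses the figures quoted above (about 10,000 nuclear proteins, half imported by the Importin-α•β system, 10 remaining import-Kapβs). The share of proteins that enter by diffusion or piggy-back is not given in the text, so the 20% used here is a hypothetical placeholder, and pathway overlap is ignored.

```python
def cargos_per_kap(nuclear_proteins: int, classical_share: float,
                   passive_or_piggyback_share: float, n_kaps: int) -> float:
    """Average number of direct cargos per import-Kap, ignoring pathway overlap."""
    remaining = nuclear_proteins * (1.0 - classical_share - passive_or_piggyback_share)
    return remaining / n_kaps


# Upper bound: every non-classical nuclear protein needs its own import-Kap.
upper = cargos_per_kap(10_000, classical_share=0.5,
                       passive_or_piggyback_share=0.0, n_kaps=10)

# Hypothetical scenario: 20% of nuclear proteins diffuse in or piggy-back.
scenario = cargos_per_kap(10_000, classical_share=0.5,
                          passive_or_piggyback_share=0.2, n_kaps=10)

print(f"upper bound: {upper:.0f} cargos per Kap; with 20% passive entry: {scenario:.0f}")
```

Either way the estimate stays in the hundreds per karyopherin, whereas overlap between pathways would push the number of cargos recognized by each Kapβ higher still.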
We can roughly divide all known Kapβ binding determinants in import cargos into two types, linear and conformational NLSs. Linear NLSs are contiguous in sequence and can be represented as sequence patterns. They are likely to bind Kapβs in extended conformation or as isolated helices. In contrast, conformational NLSs are folded domains with surface patches (not contiguous in sequence) that contact the Kapβs. From the survey of Kapβ cargos above, all Kapβ subfamilies seem to recognize both linear and conformational NLSs. Some Kapβs such as Kapβ2, Importin-5, Transportin-SRs, Kap121p and Kap123p appear partial to linear NLSs whereas others such as Importin-13 and Importin-β seem to bind many conformational NLSs. However, these observed trends might change as more cargos are identified.
The majority of the linear NLSs recognized by a particular Kapβ are very diverse and poorly defined in sequence. Therefore, their consensus sequences are not obvious, shared characteristics have not been defined and the individual nuclear targeting segments cannot yet be classified into NLS classes. As with the PY-NLS, the traditional way of describing a linear recognition motif with a strongly restrictive consensus sequence is probably insufficient for most uncharacterized NLSs. These signals will need to be defined by a collection of physical characteristics that describe their interactions with the karyopherin. The multidisciplinary approach used to study the PY-NLS will be applicable to identifying analogous signals across the Kapβ family. More generally, many biological recognition processes involve linear recognition motifs with weak and obscure sequence motifs. Chook’s concept of signals as a collection of physical rules rather than specific sequence motifs alone could also be expanded across organelle systems to decode the numerous obscure targeting signals and recognition motifs in eukaryotic cells [18,43].
The coverage of conventional sequence-based bioinformatics searches is expected to be severely limited due to the observed cargo/NLS diversity of most Kapβs. Identifying correct sequences that will account for most of the very diverse PY-NLSs and other uncharacterized diverse NLSs poses an extremely challenging task. The core problem is that complex NLSs are likely to contain multiple binding epitopes and their binding energies are distributed across the epitopes in many different ways. A strong epitope will allow other epitopes to diverge beyond consensus motifs and still maintain sufficient binding affinity and import activity. Thus, a conventional sequence search will require relaxing sequence constraints, which will also increase “noise” and result in many false positives. Effective computational methods to identify complex multivalent NLSs will need to combine sequence-based bioinformatics with structural modeling and prediction of binding energies.
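The trade-off between relaxed constraints and false positives can be demonstrated with a toy pattern search. The sketch below is illustrative only: the strict pattern is an assumed regular-expression rendering of the commonly cited C-terminal PY-NLS motif, [RKH]-X(2-5)-P-Y, the relaxed pattern is a hypothetical loosening of its last position, and the background consists of random sequences that contain no real NLSs, so every hit is by construction a false positive.

```python
import random
import re

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Assumed patterns, for illustration only.
STRICT = re.compile(r"[RKH].{2,5}PY")
RELAXED = re.compile(r"[RKH].{2,5}P[YFLMIV]")  # hypothetical relaxed final position


def count_hits(pattern, sequences):
    """Number of sequences containing at least one match to `pattern`."""
    return sum(1 for seq in sequences if pattern.search(seq))


def random_background(n, length, seed=0):
    """Uniform random protein-like sequences; a crude stand-in for a proteome."""
    rng = random.Random(seed)
    return ["".join(rng.choice(AMINO_ACIDS) for _ in range(length)) for _ in range(n)]


background = random_background(n=2000, length=300)
strict_hits = count_hits(STRICT, background)
relaxed_hits = count_hits(RELAXED, background)
print(f"strict: {strict_hits}/2000 sequences, relaxed: {relaxed_hits}/2000 sequences")
```

Because the relaxed pattern matches everything the strict one does, its hit count on the background can only go up, which is the "noise" that structural modeling and binding-energy prediction would be needed to filter out.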
Finally, many cargos are imported by multiple Kapβs. Some seem to bind multiple Kapβs within the same nuclear targeting segments and others have multiple distinct NLSs. It is not clear exactly which binding epitopes are recognized by the different Kapβs or if there are common epitopes that are recognized by many Kapβs. Many uncharacterized NLSs may contain at least one basic epitope that binds multiple Kapβs including Kapβ2 and the Importin-α•β heterodimer. It is not understood why some cargos are imported by multiple Kapβs. Are they more abundant in the cell, larger in size, or do they require complex regulation? Answers to these questions may emerge when a comprehensive nucleo-cytoplasmic traffic map is available.
We thank Katharine Ullman, Bostjan Kobe, Gino Cingolani, Beatriz Fontoura, Gurol Süel and Douglass Forbes for information and discussions. This work is funded by the National Institutes of Health (R01-GM069909 and 5-T32-GM008297), Welch Foundation (I-1532), and UT Southwestern Endowed Scholars Program.