The leucine-rich nuclear export signal (NES) is the only known class of targeting signal that directs macromolecules out of the cell nucleus. NESs are short stretches of 8–15 amino acids with regularly spaced hydrophobic residues that bind the export karyopherin CRM1. NES-containing proteins are involved in numerous cellular and disease processes. We compiled a database named NESdb that contains 221 NES-containing CRM1 cargoes that were manually curated from the published literature. Each NESdb entry is annotated with information about sequence and structure of both the NES and the cargo protein, as well as information about experimental evidence of NES-mapping and CRM1-mediated nuclear export. NESdb will be updated regularly and will serve as an important resource for nuclear export signals. NESdb is freely available to nonprofit organizations at http://prodata.swmed.edu/LRNes.
Dynamic nuclear–cytoplasmic trafficking of macromolecules controls many eukaryotic cellular processes, such as gene expression, signal transduction, cell differentiation, and immune response. The karyopherin-β family of transport factors recognizes targeting signals within cargo proteins for transport in and out of the nucleus. Nuclear localization signals direct proteins into the nucleus, and nuclear export signals (NESs) direct proteins into the cytoplasm (reviewed in Görlich and Kutay, 1999; Chook and Blobel, 2001; Conti and Izaurralde, 2001; Weis, 2003; Kutay and Güttinger, 2005; Tran et al., 2007; Xu et al., 2010).
The leucine-rich or classic NES is the only class of nuclear export signal that has been characterized. An NES is 8–15 amino acids long and contains regularly spaced hydrophobic residues. The name leucine-rich NES was coined because the first signals identified in the HIV-1 Rev and PKIα proteins are enriched with leucine residues (Fischer et al., 1994; Meyer and Malim, 1994; Wen et al., 1995). Since then, many more NES-containing proteins have been identified, and mutagenesis and computational analyses have shown the NES sequences to be more diverse and to conform to the loose consensus sequence Φ-X(2-3)-Φ-X(2-3)-Φ-X-Φ, where Φ is L, V, I, F, or M and X is any amino acid (Bogerd et al., 1996; Henderson and Eleftheriou, 2000; Engelsma et al., 2004; la Cour et al., 2004; Kutay and Güttinger, 2005). The NES is recognized by the export karyopherin CRM1, which is also known as exportin 1 (Fornerod et al., 1997; Fukuda et al., 1997; Neville et al., 1997; Ossareh-Nazari et al., 1997; Richards et al., 1997; Stade et al., 1997). Recently published crystal structures of CRM1 bound to several NESs showed that the signals adopt either combined α-helix–loop or all-loop structures that bind in a hydrophobic groove on the convex surface of CRM1 (Dong et al., 2009a, b; Monecke et al., 2009; Güttler et al., 2010). Leptomycin B (LMB) inhibits nuclear export by forming a covalent bond with Cys528 of human CRM1, which is located in the NES-binding groove, thus blocking access of the NES to its binding site (Kudo et al., 1999; Dong et al., 2009b; Monecke et al., 2009).
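As an illustration, the loose consensus Φ-X(2-3)-Φ-X(2-3)-Φ-X-Φ (Φ = L, V, I, F, or M; X = any amino acid) can be written as a regular expression. The following is a minimal sketch and not part of NESdb: the function name `find_nes_candidates` and the test sequence are hypothetical. A consensus match only marks a candidate segment; it is not evidence of a functional NES.

```python
import re

# Hydrophobic positions (Phi) allowed by the loose consensus: L, V, I, F, or M.
PHI = "[LVIFM]"

# Phi-X(2-3)-Phi-X(2-3)-Phi-X-Phi, wrapped in a lookahead so that
# candidates starting at overlapping positions are all reported.
# [A-Z] stands in for "any amino acid" in one-letter code.
NES_CONSENSUS = re.compile(
    rf"(?=({PHI}[A-Z]{{2,3}}{PHI}[A-Z]{{2,3}}{PHI}[A-Z]{PHI}))"
)

def find_nes_candidates(sequence):
    """Return (0-based start, segment) for each consensus match.

    At most one segment is reported per start position (the first one
    the regex engine finds). Hypothetical helper for illustration only.
    """
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in NES_CONSENSUS.finditer(seq)]

if __name__ == "__main__":
    # Synthetic sequence made up for this example, not a real protein.
    for start, segment in find_nes_candidates("GSLEEKLSGLDIGS"):
        print(start, segment)
```

Because the pattern is short and permissive, it is expected to match many non-functional segments, which is why experimental evidence remains necessary.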
NESs have been identified in >300 proteins with diverse functions, such as transcription factors, cell cycle regulators, ribonucleoprotein complexes, translation factors, and viral proteins (Fischer et al., 1994; Wen et al., 1995; Fridell et al., 1996; Ho et al., 2000; Murdoch et al., 2002; Vissinga et al., 2009). Nuclear export of viral proteins by CRM1 is important for replication of many viruses that cause human diseases. Mislocalization of cellular CRM1 cargoes also disrupts numerous cellular processes, often resulting in disease. CRM1–NES interactions are therefore a potential therapeutic target for many disease conditions, such as cancer and viral infections (Bogerd et al., 1995; Yi et al., 2002; Faustino et al., 2007; Noske et al., 2008).
A database of 80 NESs named NESbase 1.0 was compiled in 2003 (la Cour et al., 2003). More recently, Fu et al. (2011) published a list of 70 NES-containing proteins. Here, we present NESdb, an up-to-date and substantially larger NES database with 221 experimentally identified entries. Each entry is annotated with many detailed features related to the sequence, structure, and nuclear export activity of the NESs and cargo proteins. NESdb is a valuable information resource for the biomedical research community to learn about nuclear export signals that have already been identified. Analysis of the sequences and three-dimensional structures of NESs in NESdb and false-positive NESs generated from NESdb revealed some distinguishing features that might be important for the future development of accurate NES prediction algorithms (Xu et al., 2012).
NESdb contains 221 entries as of December 2011. Each entry is a protein that contains one or more NESs. All NESs listed in NESdb were experimentally identified and reported in the published literature. Both the PubMed and UniProt databases were searched using the keywords “nuclear export signal,” “NES,” and “CRM1” (Jain et al., 2009; The UniProt Consortium, 2011). The returned literature was examined with the following criteria to identify the existence of an experimentally tested NES: 1) evidence of CRM1-dependent nuclear export, such as binding to CRM1, inhibition by LMB, nuclear retention at nonpermissive temperature in CRM1 temperature-sensitive yeast strains, or competition with other CRM1 cargoes; 2) the presence of a protein segment that matches the traditional NES consensus sequence Φ-X(2-3)-Φ-X(2-3)-Φ-X-Φ and can target a reporter protein for nuclear export; and 3) the presence of mutations within the tested NES segment that abolished nuclear export of the full-length protein. All proteins in NESdb meet the first criterion, and many meet all three criteria. The collected information is manually entered into the database. NESdb was implemented as a MySQL database. PHP5 was used to connect to the database and dynamically generate HTML pages. An Apache Web server hosted on a Linux cluster serves the database.
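The three curation criteria can be pictured as a small evidence record per protein. This is a sketch only: the class `EvidenceRecord`, its field names, and the example protein are hypothetical and do not reflect the actual NESdb schema. It assumes that criterion 1 is the minimum for inclusion, as the statement that all proteins meet the first criterion suggests.

```python
from dataclasses import dataclass

@dataclass
class EvidenceRecord:
    """Hypothetical record of the three curation criteria for one protein."""
    protein: str
    crm1_dependent_export: bool             # criterion 1
    consensus_segment_exports_reporter: bool  # criterion 2
    nes_mutations_abolish_export: bool      # criterion 3

    def qualifies(self) -> bool:
        # Assumption: criterion 1 alone is the minimum for inclusion.
        return self.crm1_dependent_export

    def criteria_met(self) -> int:
        # Number of the three criteria supported by the literature.
        return sum([
            self.crm1_dependent_export,
            self.consensus_segment_exports_reporter,
            self.nes_mutations_abolish_export,
        ])

if __name__ == "__main__":
    # Made-up example, not an NESdb entry.
    record = EvidenceRecord("hypothetical protein A", True, True, False)
    print(record.qualifies(), record.criteria_met())
```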
The NESdb database is freely available for nonprofit organizations at http://prodata.swmed.edu/LRNes. At this time, NESdb contains 221 experimentally identified CRM1 cargoes reported in the literature. The published literature is searched on a bimonthly basis and NESdb is updated with every 20 new entries. However, many sequences in the genome, especially those in amphipathic helices, match the NES consensus, thus making accurate NES identification difficult. It is likely that some published studies contain mistakenly identified NESs. As a caution to the research community, we separated the 221 proteins in NESdb into two groups. The first group is named “NESs” and contains experimentally identified NESs with no contradicting experimental evidence. The second group is named “NESs in doubt” and contains proteins whose NESs were initially reported but whose validity was called into question by subsequent experiments. Clicking the corresponding link on the main page brings up a list of the proteins that belong to each group. The list can be sorted alphabetically by protein name or numerically by protein ID number in NESdb. Users are able to positively or negatively flag specific NES-containing proteins on their individual pages. A tally of flags for each protein is displayed next to its name on the list. An entry with many negative flags will be reevaluated and moved to the “NESs in doubt” category, or vice versa. The database is also equipped with a search button, which searches the full name, alternative names, and organism of proteins for the keywords. Clicking on a particular protein loads the individual page for that protein.
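The listing behavior described above (keyword search over full name, alternative names, and organism, plus sorting by name or ID) can be sketched as follows. This is an illustration under assumptions, not the PHP/MySQL code behind NESdb: the `Entry` class, its fields, and the functions `search` and `sort_entries` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Entry:
    """Hypothetical in-memory stand-in for one NESdb protein entry."""
    entry_id: int
    full_name: str
    alt_names: List[str] = field(default_factory=list)
    organism: str = ""
    positive_flags: int = 0
    negative_flags: int = 0

def search(entries, keyword):
    """Case-insensitive substring search over name, alternative names, organism."""
    kw = keyword.lower()
    return [
        e for e in entries
        if kw in e.full_name.lower()
        or any(kw in alt.lower() for alt in e.alt_names)
        or kw in e.organism.lower()
    ]

def sort_entries(entries, by="name"):
    """Sort alphabetically by protein name or numerically by entry ID."""
    if by == "name":
        return sorted(entries, key=lambda e: e.full_name.lower())
    if by == "id":
        return sorted(entries, key=lambda e: e.entry_id)
    raise ValueError("by must be 'name' or 'id'")
```

Substring matching is the simplest choice for a sketch; the actual site may match keywords differently.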
Each entry contains 14 features related to the sequence, structure, and nuclear export activity of the NESs and cargo proteins. A sample page for snurportin 1 (SNUPN) is shown in Figure 1. The NES features include the following:
NESdb will contribute to the understanding of how protein function is controlled by intracellular localization and will serve as a useful resource for the development of inhibitors that target CRM1-mediated nuclear export. NESdb may be used to train and test new NES prediction algorithms to increase the reliability and accuracy of identifying vague and diverse NESs in the genome.
We thank Lisa Kinch for insightful suggestions for user interface, Ming Tang for technical assistance with Web server hosting, and Maarten Fornerod for discussion. This work is funded by the National Institutes of Health (F32GM093493 to D.X., R01-GM069909 to Y.M.C., and R01-GM094575 to N.V.G.), the Welch Foundation (I-1532 to Y.M.C. and I-1505 to N.V.G.), the Leukemia and Lymphoma Society Scholar Program (to Y.M.C.), the Cancer Prevention Research Institute of Texas (PR-101496 to Y.M.C.), and the UT Southwestern Endowed Scholars Program (to Y.M.C. and N.V.G.).
This article was published online ahead of print in MBoC in Press (http://www.molbiolcell.org/cgi/doi/10.1091/mbc.E12-01-0045) on July 25, 2012.