Characterization of full-length cDNAs of mouse and human COP1s
The previously reported partial MmCOP1 mRNA (GenBank accession number AF151110.1) encodes a protein product that corresponds to the full-length AtCOP1, but lacks the ATG start codon at its 5' end [20]. A homology search against the updated public EST database with the 5' end sequence of the available partial MmCOP1 cDNA allowed the identification of human and mouse EST clones containing a predicted start codon and a partial 5' UTR, and the subsequent construction of full-length MmCOP1 cDNAs (see Experimental procedures for details). The presence of stop codons just upstream of the predicted start codons, and the comparable sizes of the cDNAs and the mRNA on northern blots (data not shown), suggest that they contain full-length coding capacity. The full-length MmCOP1 cDNAs (GenBank accession numbers AF151110.2 and AF508940) encode polypeptides of 773 and 771 amino acids respectively and share 97.4% identity (Fig and data not shown). Compared to AtCOP1, the predicted full-length MmCOP1 and HsCOP1 proteins contain unique N-terminal glycine-serine rich extensions of about 70 amino acids (Fig. ).
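As an illustration of how a percent-identity figure of this kind can be computed, the following minimal sketch counts identical residues over the ungapped columns of a pairwise alignment. The alignment program and the exact denominator used for the 97.4% value are not stated here, so the convention below is an assumption, and the two aligned fragments are made-up toy strings rather than COP1 sequence.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumption: identity is matches / ungapped columns. Other conventions
    (e.g. dividing by the shorter sequence length) give different numbers.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment (not real COP1 sequence): 9 ungapped columns, 8 matches.
toy_a = "MSGS-RQGAL"
toy_b = "MSGSTRQGVL"
print(round(percent_identity(toy_a, toy_b), 1))
```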
Figure 1 The general characteristics of the mammalian COP1 proteins and genes. (A) Predicted amino acid sequence of HsCOP1. The N-terminal glycine-serine rich extension is underlined. The RING finger domain is boxed. The coiled-coil domain is indicated by dotted (more ...)
The mammalian COP1 is expressed in many different tissues and organs in both embryos and adults, based on the available EST information. Mammalian COP1 EST clones have been found in aorta, B-cell, bladder, brain, breast, cervix, colon, germ cell, head/neck, heart, kidney, liver, lung, myotube, nerve, ovary, parathyroid, placenta, prostate, single cell zygote, spleen, testis, T-cell, tonsil, unfertilized egg, uterus, and whole embryo from human, mouse and rat. Northern blot analysis with selected tissue types confirmed this ubiquitous expression pattern (data not shown).
Genomic structures of animal COP1s and the evolutionary implication
Using the predicted HsCOP1 cDNA sequence, BLAST searches against the draft human genome database revealed significant hits on chromosomes 1, 3, 9 and 18. Because only the homologous sequences on chromosome 1 match the HsCOP1 cDNA with 100% identity, we conclude that the functional human COP1 gene (designated HsCOP1) is located on chromosome 1q24.1 (Fig ), and that the homologous sequences on chromosomes 3, 9 and 18 probably represent COP1 pseudogenes, as they all seem to encode truncated proteins. The corresponding sequenced BAC clones at this chromosomal location (GenBank accession numbers AL359265, AL162736, AL590723, and AL513329) can be assembled using the HsCOP1 cDNA sequence as a reference (Fig ). The HsCOP1 gene contains 20 exons and the transcribed region is about 263 kb in length (GenBank accession number TPA: BK000438, Fig ). We were also able to identify the full-length genomic sequence of mosquito COP1 (Anopheles gambiae COP1, AgCOP1) and partial genomic sequences of MmCOP1 and fugu COP1 (Fugu rubripes COP1, FrCOP1). The genomic structures of the vertebrate COP1 genes appear to be generally conserved among themselves, but differ from those of the mosquito and plant COP1 genes (data not shown).
BLAST searches have also identified COP1 homologues from many other animal species, including zebrafish, frog, chicken, rat, cow, dog, horse, and pig. However, we have so far failed to find COP1 homologues in Drosophila, C. elegans, or yeast, whose whole genome sequences are available. Interestingly, COP1 is found in mosquito but not in Drosophila.
Sequence comparison reveals that the three structural motifs present in Arabidopsis COP1, including the RING finger, the coiled coil and the WD40 domains, exist in COP1 proteins from all other species. The glycine-serine rich N-terminal sequence appears to be unique to vertebrate COP1s, as it is missing in mosquito and plant COP1 proteins. However, it should be pointed out that the amino acids in this region are not well conserved and their functional role in vertebrates is unclear.
To reveal their evolutionary relationship, we generated a phylogenetic tree based on the amino acid sequences of the N-terminal RING finger-containing fragments (all corresponding to the first 200 amino acids of HsCOP1) from representative COP1 homologues, as only partial COP1 sequences are available in the public database for some species. Similar phylogenetic trees were generated with or without the vertebrate-specific N-terminal extensions, and a tree with the extension included in the analysis is shown in Fig . As expected, plant and animal COP1s form distinct clades, and the homologies between COP1s from different species are generally consistent with the evolutionary distances between these species (Fig ).
The association of mammalian COP1 with ubiquitinated proteins in vivo
A number of RING finger proteins have recently been shown to act as E3 ubiquitin ligases [24]. WD40 domains have also frequently been found in E3 complexes as substrate-recruiting domains [25]. Despite compelling evidence from Arabidopsis implying that COP1 may mediate protein degradation [4], no direct evidence of COP1 participating in ubiquitination has been reported.
As an initial step to study a possible role of mammalian COP1 in ubiquitination, we transiently expressed FLAG-tagged COP1 (FLAG-COP1) in the human cell line 293 together with HA-tagged ubiquitin (HA-Ub), followed by immunoprecipitation with anti-FLAG antibodies. Immunoblot analysis of the immunoprecipitates with anti-HA antibodies detected multiple distinct bands and a smear only when FLAG-COP1 and HA-Ub were co-expressed, indicating that at least a subset of COP1-associated proteins are ubiquitinated in human cells (Fig. ).
Figure 2 Association of mammalian COP1 with ubiquitin and ubiquitinated protein in vivo. (A) Mammalian COP1 associates with ubiquitinated proteins in vivo. Lysates from 293 cells transfected with FLAG-tagged MmCOP1 and/or HA-tagged ubiquitin were subjected to immunoprecipitation (more ...)
To further investigate whether COP1 itself undergoes ubiquitination, a His-tagged ubiquitin expression construct (His-Ub) was co-transfected into the 293 cells with FLAG-COP1, followed by His purification under denaturing conditions (in 8 M urea). Under these conditions, only proteins covalently linked to His-Ub would be co-purified. Immunoblot analysis of the precipitates with anti-FLAG antibodies detected a higher molecular weight smear only upon co-expression of FLAG-COP1 and His-Ub, confirming that COP1 is indeed a ubiquitination substrate (Fig. ). Similar data were obtained when a deletion construct, FLAG-N280, which contains an intact RING finger domain, was used (see later in Fig ). Though FLAG-N280 was not as heavily ubiquitinated as the full-length FLAG-COP1, it clearly showed a ladder pattern, in which the size difference between neighboring bands of the ladder was consistent with the unit molecular weight of His-Ub (Fig. ). Therefore, mammalian COP1 not only associates with ubiquitinated proteins in vivo, but is also itself a substrate for multi-ubiquitination.
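The ladder argument above rests on simple arithmetic: each added His-Ub should shift the substrate by one fixed unit mass. The sketch below computes the expected band positions and their spacing. The ubiquitin mass is the commonly quoted approximate value, while the tag mass and the 35 kDa substrate mass are illustrative assumptions, not values measured in this study.

```python
UBIQUITIN_KDA = 8.6  # approximate mass of one ubiquitin (76 residues)
HIS_TAG_KDA = 0.8    # assumed mass of a 6xHis tag; depends on the actual construct


def expected_ladder(substrate_kda, max_ub, unit_kda=UBIQUITIN_KDA + HIS_TAG_KDA):
    """Expected masses (kDa) of a substrate carrying 0..max_ub His-Ub units."""
    return [round(substrate_kda + n * unit_kda, 1) for n in range(max_ub + 1)]


def band_spacings(bands_kda):
    """Differences between neighbouring bands, to compare against the unit mass."""
    return [round(b - a, 1) for a, b in zip(bands_kda, bands_kda[1:])]


# Hypothetical 35 kDa substrate (placeholder, not the measured size of FLAG-N280).
ladder = expected_ladder(35.0, 3)
print(ladder)
print(band_spacings(ladder))
```

Observed spacings close to the unit mass are what one would expect for stepwise addition of single ubiquitin units; apparent sizes on SDS-PAGE can deviate from calculated masses, so this is only a rough check.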
Figure 5 Subcellular localization of MmCOP1 deletion mutant proteins (A) Diagrams of the deletion constructs of MmCOP1 used for the localization studies. ΔN70, ΔRING, N346, N280 and N200 are shown in comparison to the full-length MmCOP1 protein (more ...)
The mammalian COP1 protein is localized to both the nucleus and the cytoplasm
The subcellular localization of AtCOP1 is regulated by light [14], a feature that is critical for its function [4]. Interestingly, the localization of mammalian COP1 protein expressed in plant cells can also be regulated by light [20], implying that a similar mechanism may operate for the nucleocytoplasmic partitioning of COP1 in mammalian cells.
To investigate this possibility, we first set out to examine the subcellular localization of endogenous COP1 in cultured mammalian cells. Polyclonal antibodies against an N-terminal fragment of MmCOP1 (amino acids 71–270) were generated in rabbits and affinity purified. Endogenous COP1 is not detectable by immuno-fluorescence with our anti-COP1 antibodies, probably due to a very low expression level. Instead, we took a subcellular fractionation approach to study the localization pattern of endogenous COP1 protein. The purified anti-COP1 antibodies detected a unique band of about 90 kDa in HeLa whole cell lysates that was not detected by the preimmune serum (Fig ). This 90-kDa band should represent the endogenous COP1, based on the observations that anti-COP1 antibodies harvested from two other rabbits detect a band of the same molecular weight (Fig ) and that RNAi against COP1 can specifically reduce the level of this band (our unpublished data). After subcellular fractionation of HeLa cells, each fraction underwent immunoprecipitation and subsequent immunoblotting with anti-COP1 antibodies. As shown in Fig , the endogenous COP1 is localized predominantly in the nucleus, but a small amount may also be present in the cytosol. Within the nucleus, COP1 is present in both the nucleoplasm (NP) and the nuclear envelope (NE) fractions, although COP1 is more enriched in the nucleoplasm (Fig ). Identical samples from all fractions were also probed with antibodies against other control proteins (Fig. ). As expected, Lamin A and C, components of the nuclear lamina, were found predominantly in the NE fraction (Fig ). This fraction was also highly enriched in nuclear pore complexes, as evidenced by the enrichment of Nup62. Nup62 was detectable in smaller amounts in the cytosol and NP fractions as well (Fig ), consistent with its cellular distribution described previously [30]. On the other hand, Hsp110 was found predominantly in the cytosol fraction, while Hsp90 was found both in the cytosol and the NP fractions (Fig ). Two subunits of the COP9 signalosome, CSN2 and CSN6, were located both in the cytosol and the NP fractions, but not in the NE fractions (Fig ). These controls thus validated our fractionation results on COP1 subcellular distribution.
Figure 3 Subcellular localization of the mammalian COP1 (A) Detection of endogenous HsCOP1 protein by immunoprecipitation and immunoblot. Whole cell lysates of HeLa cells were immunoprecipitated with purified anti-COP1 antibodies from rabbit 1 (R1) or preimmune (more ...)
As another line of experiments, we generated expression constructs containing N-terminal GFP- and FLAG-tagged MmCOP1 and HsCOP1. As shown in Fig. , the subcellular localization pattern of GFP-COP1 varies when expressed in COS7 cells. In general, their localization patterns can be categorized into three types. In Type I cells, GFP-COP1 is expressed mostly in the cytoplasm, with some enrichment around the nucleus (Fig , Type I). In Type II cells, GFP-COP1 is expressed in both the cytoplasm and the nucleus, and also on the NE (Fig , Type II). In the third type, GFP-COP1 protein is expressed mostly in the nucleus, both in the NP and on the NE (Fig , Type III). All three types of localization patterns are well represented among the transfected cells; their relative proportions vary in each experiment, but do not seem to correlate with the expression level of GFP-COP1. Similar results were obtained when other cell lines (including HeLa, NIH3T3, and CHO cell lines) were used or immuno-fluorescence studies were carried out with FLAG-COP1 instead of GFP-COP1 (data not shown).
The full-length mammalian COP1 protein can localize to the nuclear envelope
To test if COP1 is tightly bound to the NE, we extracted cells with 1% Triton X-100 before fixation. This treatment was shown to remove non-NE bound proteins from both the cytoplasm and the nucleus [31]. Indeed, after Triton extraction, GFP-COP1 protein was completely eliminated from the cytoplasm and the nucleoplasm (Fig , +Triton). Only NE-bound GFP-COP1 protein was retained, as characterized by a distinct ring-like pattern around the nucleus. On the other hand, the GFP control was completely removed from the cells by the same treatment (data not shown), indicating that the retention of GFP-COP1 on the NE is mediated by COP1 and not by the GFP tag. Similarly, FLAG-COP1 also remained on the NE after Triton extraction (data not shown). The different localization patterns in different populations of cells imply that mammalian COP1 may partition among the NP, NE, and cytoplasm compartments.
Human COP1 is part of protein complexes of over 700 kDa
Because HsCOP1 is enriched in the nucleoplasm fraction (Fig ), we used this fraction from HeLa cells for a gel filtration analysis to examine possible form(s) of HsCOP1 in human cells. Endogenous HsCOP1 was found to accumulate primarily in large complexes of over 700 kDa, whereas no monomeric or dimeric form of HsCOP1 was detected (Fig ). This finding is distinct from the observation for AtCOP1, which exists mainly as a homodimer [32]. In addition, the HsCOP1 complexes differ from the HsCOP9 signalosome complexes in size, as suggested by the different peak fractions (Fig ). Finally, transiently expressed FLAG- or GFP-tagged HsCOP1 or MmCOP1 co-fractionates with endogenous COP1 (data not shown), suggesting that the behavior of these fusion proteins most likely reflects that of the endogenous COP1 protein.
Figure 4 Gel filtration analysis of endogenous HsCOP1 protein. The nucleoplasm extract from HeLa cells was fractionated by Superose 6 column. Fractions were concentrated with StrataClean resin and subjected to immunoblot with anti-COP1, anti-CSN2 and anti-CSN8 (more ...)
Both the nuclear and the cytoplasmic localization signals of the mammalian COP1 are present in the N-terminal region
As a first step to locate the signals responsible for the subcellular localization of mammalian COP1, we made a series of deletion constructs of GFP-MmCOP1 (Fig ). As shown in figure and , GFP-ΔN70 (amino acids 71–773), which lacks the vertebrate-specific N-terminal extension, also exhibited three general localization patterns in transfected cells (mostly cytoplasmic, in both compartments, or mostly nuclear), similar to full-length GFP-COP1 (Fig ). However, GFP-ΔN70 does not display the NE localization pattern, indicating that the first 70 amino acids, which make up the glycine-serine rich region, may be responsible for targeting MmCOP1 to the NE (Fig and ). Similarly, AtCOP1 does not contain this region and is not localized to the NE [14].
Unlike GFP-ΔN70, GFP-ΔRING (amino acids 226–773) was found only in the cytoplasm in all the cells examined (Fig and ), suggesting that the NLS of MmCOP1 is probably located within amino acids 71 to 226 of MmCOP1. GFP-N346 (amino acids 71–416) contains the RING finger, the coiled-coil domain, and the region corresponding to the AtCOP1 NLS [17] and is localized exclusively in the nucleus, forming distinctive nuclear speckles (Fig and ). GFP-N280 (amino acids 71–350), which contains the RING finger and the coiled-coil domain, but not the sequence corresponding to the NLS in AtCOP1, is also localized solely in the nucleus (Fig and ). However, unlike GFP-N346, GFP-N280 does not form speckles (Fig ). These results show that the region from amino acid 351 to 416 is not necessary for nuclear localization but contains a signal for speckle formation. It is worth noting that deletion constructs of AtCOP1 covering the corresponding region were found to localize predominantly in the cytoplasm, distinct from the mouse GFP-N280 fusion proteins [17].
In contrast to GFP-N346 and GFP-N280, GFP-N200 (amino acid 71–270), which contains an intact RING finger and a partial coiled-coil domain, is localized entirely in the cytoplasm (Fig and ). Similar observations were made when other cell lines were used (HeLa, NIH3T3 and CHO cells) or by using FLAG-tagged construct series instead of GFP fusions (data not shown). Taken together, both the major nuclear and cytoplasmic localization signals seem to reside in the N-terminal region of MmCOP1, most likely between amino acid 71 and 350.
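The deletion-mapping logic can be made explicit by tabulating the construct coordinates and intersecting the ranges of constructs that share a phenotype. The minimal sketch below uses the coordinates and localizations reported above; the ASCII construct names, the `Construct` type and the `shared_region` helper are hypothetical, and the C-terminal end of ΔN70 is taken to be residue 773 (the full-length C-terminus), which is an assumption.

```python
from typing import List, NamedTuple, Optional, Tuple


class Construct(NamedTuple):
    name: str
    start: int          # first MmCOP1 residue included
    end: int            # last MmCOP1 residue included
    localization: str   # "cytoplasm", "nucleus" or "mixed"


CONSTRUCTS = [
    Construct("dN70", 71, 773, "mixed"),       # end assumed to be the C-terminus
    Construct("dRING", 226, 773, "cytoplasm"),
    Construct("N346", 71, 416, "nucleus"),
    Construct("N280", 71, 350, "nucleus"),
    Construct("N200", 71, 270, "cytoplasm"),
]


def shared_region(constructs: List[Construct]) -> Optional[Tuple[int, int]]:
    """Inclusive residue range present in every construct, or None if empty."""
    start = max(c.start for c in constructs)
    end = min(c.end for c in constructs)
    return (start, end) if start <= end else None


cytoplasm_only = [c for c in CONSTRUCTS if c.localization == "cytoplasm"]
print(shared_region(cytoplasm_only))  # region common to dRING and N200
```

The region common to the two exclusively cytoplasmic constructs is a natural place to look for a cytoplasmic localization or export signal, although an intersection of this kind only narrows the search and does not prove that a single signal is responsible.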
Identification of a classic leucine-rich nuclear export signal in mammalian COP1
Because both GFP-N200 and GFP-ΔRING are localized exclusively in the cytoplasm, the cytoplasmic localization signal is probably located within the overlapping region of these two constructs (amino acids 226–270). Interestingly, a sequence (L, amino acids 237 to 247), located within the N-terminal region of the coiled-coil domain, matches classic leucine-rich nuclear export signals (NES), such as those from Rex and PKI (Fig ) [33]. In order to test if this sequence is a bona fide NES, we made mutations at two conserved leucine residues (L234A and L236A) in the GFP-N200 construct (Fig ). The mutant protein GFP-N200NESmut, in contrast to wild-type GFP-N200, is entirely nuclear localized (Fig ). Furthermore, treatment of wild-type GFP-N200 with the CRM1/exportin-specific inhibitor Leptomycin B (LMB) also causes the protein to be localized exclusively in the nucleus (Fig ). These results confirmed the identified sequence (L) within the coiled-coil domain of mammalian COP1 as a likely CRM1/exportin-specific NES. Notably, the corresponding site in AtCOP1 is within the mapped cytoplasmic localization sequence [17]. Moreover, the NLS must also be located within amino acids 71 to 270 of MmCOP1, since NES-mutated or LMB-treated N200 is nuclear localized (Fig ).
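Matching a candidate sequence against the classic leucine-rich NES pattern can be sketched as a regular-expression scan. The consensus used below, Φ-x(2,3)-Φ-x(2,3)-Φ-x-Φ with Φ taken as L, I, V, F or M, is one commonly quoted form and is an assumption here, as is the PKI core sequence, which is given as it is usually cited in the literature. The COP1 NES sequence itself is not reproduced, and the mutant string is a hypothetical leucine-to-alanine variant of the PKI core used only to show loss of the match.

```python
import re

# Assumed consensus: hydrophobic residues spaced as PHI-x(2,3)-PHI-x(2,3)-PHI-x-PHI.
HYDROPHOBIC = "[LIVFM]"
NES_PATTERN = re.compile(
    # Lookahead with a capture group so that overlapping candidates are reported.
    f"(?=({HYDROPHOBIC}.{{2,3}}{HYDROPHOBIC}.{{2,3}}{HYDROPHOBIC}.{HYDROPHOBIC}))"
)


def find_nes_candidates(seq, offset=1):
    """Return (position, matched_sequence) pairs; positions start at `offset`."""
    return [(m.start() + offset, m.group(1)) for m in NES_PATTERN.finditer(seq.upper())]


PKI_NES_CORE = "LALKLAGLDI"     # PKI NES core as commonly cited (assumption)
HYPOTHETICAL_MUT = "AALKAAGLDI"  # two leucines replaced by alanine, for illustration

print(find_nes_candidates(PKI_NES_CORE))
print(find_nes_candidates(HYPOTHETICAL_MUT))
```

A pattern match of this kind only nominates candidates. As in the experiments above, mutagenesis of the conserved hydrophobic residues and sensitivity to LMB are what support a sequence as a functional CRM1-dependent NES.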
Figure 6 Mapping the nuclear export signal of mammalian COP1 protein (A) Diagram of the amino acid sequence of the wild-type COP1 NES, the alignment with the NESs of Rex and PKI. The amino acid replacements in the mutant protein (N200NESmut) are indicated by arrows. (more ...)
Identification of amino acid clusters critical for mammalian COP1 nuclear import and speckle formation
Nuclear import/localization signals (NLS) are usually composed of one or two clusters of positively charged amino acids. To map the nuclear import signal in mammalian COP1, we first identified three clusters of positively charged amino acids within amino acids 71–270 for mutagenesis. Furthermore, we also chose to mutate a fourth positively charged amino acid cluster at amino acids 358 to 360, which corresponds to the Arabidopsis COP1 NLS [17]. Since GFP-N346 exhibited a clear and consistent nuclear localization pattern with speckles, most closely resembling the nuclear localization pattern of full-length MmCOP1, it was used as the starting point for the site-directed mutagenesis studies (Fig ).
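Clusters of positively charged residues of the kind described above can be located with a simple scan for lysine and arginine residues that lie close together. The following sketch is one possible heuristic, not the procedure used in this study: the `max_gap` and `min_basic` thresholds are assumptions, and the input is a made-up toy string rather than COP1 sequence.

```python
def basic_clusters(seq, max_gap=2, min_basic=2, offset=1):
    """Group K/R residues separated by at most `max_gap` other residues.

    Returns (first_position, last_position, number_of_basic_residues) for each
    group containing at least `min_basic` basic residues. Positions start at
    `offset`, so a fragment can be numbered in full-length coordinates.
    """
    positions = [i + offset for i, aa in enumerate(seq.upper()) if aa in "KR"]
    clusters = []
    current = []
    for pos in positions:
        # Close the current group when the gap to the previous K/R is too large.
        if current and pos - current[-1] - 1 > max_gap:
            if len(current) >= min_basic:
                clusters.append((current[0], current[-1], len(current)))
            current = []
        current.append(pos)
    if len(current) >= min_basic:
        clusters.append((current[0], current[-1], len(current)))
    return clusters


# Hypothetical toy sequence: "RK" at 4-5, "KAKAR" at 12-16, a lone K at 20.
print(basic_clusters("AAARKAAAAAAKAKARAAAK"))
```

With `max_gap=2`, two groups of basic residues separated by three or more other residues are reported as separate clusters, which is the behaviour needed to treat neighbouring sites as distinct mutagenesis targets.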
Figure 7 Mapping the nuclear localization signal of mammalian COP1 protein (A) Diagrams of the point-mutation and deletion-mutation constructs in GFP-N346 in comparison to the wild-type GFP-N346. The position and the amino acid replacements in each mutant protein (more ...)
Unlike wild-type GFP-N346, GFP-N346NLSmut1 (R113S/K114N) and GFP-N346NLSmut3 (K205T/R206S/K208T) are localized entirely to the cytoplasm, where they still form speckles (Fig ). GFP-N346NLSmut2 (K197N/K199N/R201S) is still localized to nuclear speckles, similar to wild-type GFP-N346 (Fig ). Comparable results were obtained when the same mutations were introduced into the GFP-N280 fusion protein (data not shown). Mutations of the fourth site (R358S/K360T) within GFP-N346, giving GFP-N346NLSmut4, do not change the nuclear localization of GFP-N346, but abolish the speckles (Fig ). Consistent with this, GFP-N280, which does not contain site 4, does not form nuclear speckles (Fig and ). These results demonstrate that both sites 1 and 3 are required for nuclear import, while site 4 is important for targeting N346 to the nuclear speckles. However, we do not know the functional significance of the nuclear speckles at this time. Further, we do not know whether the cytoplasmic speckles formed by GFP-N346NLSmut1 and GFP-N346NLSmut3 are in any way related to the nuclear speckles of wild-type GFP-N346.
A novel RING finger bridged bipartite nuclear localization signal is responsible for mammalian COP1 nuclear localization
In a typical bipartite NLS, the distance between the two clusters of positively charged amino acids is usually around 10 amino acids. However, site 1 and site 3 in GFP-N346 are more than 90 amino acids apart, separated by the RING finger domain (Fig ). To determine whether the RING finger plays a role in mediating nuclear import, we mutated two key zinc-binding cysteine residues in the RING finger domain of GFP-N346 (C158A/C161A, N346RINGmut) to destroy the RING finger structure (Fig ). Indeed, these mutations abolish the nuclear localization of GFP-N346 (Fig ). The mutated proteins are located in cytoplasmic speckles similar to those of N346NLSmut1 and N346NLSmut3 (Fig ).
RING fingers are characterized by four conserved cysteine-cysteine or cysteine-histidine pairs, which bind two zinc ions [[35]; Fig ]. In particular, cysteine pair 1 and pair 3 bind one zinc ion, while cysteine-histidine pair 2 and cysteine pair 4 bind the other, thereby forming a unique cross-brace structure [[36]; Fig ]. Cysteine-histidine pair 2 and cysteine pair 3 are separated by two amino acids. Although the sequences other than the zinc-binding sites are not well conserved, the overall three-dimensional structures of all the characterized RING fingers are overwhelmingly similar, due to the conserved spacing between cysteine-histidine pair 2 and cysteine pair 3 and the tight zinc binding [37]. One common feature is that the RING finger structure brings the flanking N- and C-terminal peptides close together. The role of the COP1 RING finger in nuclear import may therefore be to act as a structural scaffold, bringing the two clusters of positively charged amino acids (site 1 and site 3) into the spatial proximity needed to fit the binding sites of the nuclear import machinery, as illustrated in Fig .
To test this hypothesis, we deleted the RING finger in GFP-N346 (Δ119–197), which placed site 1 and site 3 within a distance of about 10 amino acids of each other (Figure ). This mutant protein, GFP-N346ΔRING, maintained the same localization pattern as the wild-type GFP-N346 protein (Fig ). Therefore, mammalian COP1 appears to contain a novel nuclear import signal, composed of two clusters of positively charged amino acids bridged by a RING finger.
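The spacing argument can be checked with a short calculation from the residue numbers given above (site 1 at R113/K114, site 3 starting at K205, deletion Δ119–197). The sketch assumes a clean in-frame deletion with no linker residues added at the junction, and it counts the residues lying between the two sites; other counting conventions shift the numbers by a few residues.

```python
def linker_length(site_a_end, site_b_start, deletion=None):
    """Number of residues between two sites, optionally after an inclusive deletion.

    Assumption: the deletion lies entirely between the two sites and is replaced
    by nothing (no extra residues introduced by cloning).
    """
    linker = site_b_start - site_a_end - 1
    if deletion is not None:
        del_start, del_end = deletion
        if not (site_a_end < del_start <= del_end < site_b_start):
            raise ValueError("deletion must lie strictly between the two sites")
        linker -= del_end - del_start + 1
    return linker


SITE1_END = 114              # last residue of site 1 (R113/K114)
SITE3_START = 205            # first residue of site 3 (K205/R206/K208)
RING_DELETION = (119, 197)   # residues removed in GFP-N346 deltaRING

print(linker_length(SITE1_END, SITE3_START))                 # wild-type spacing
print(linker_length(SITE1_END, SITE3_START, RING_DELETION))  # spacing after deletion
```

Under these assumptions the intervening segment shrinks from 90 residues to 11, which is in the range expected for the spacer of a typical bipartite NLS.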