Ubiquitin-fold modifier 1 (Ufm1)-specific protease 2 (UfSP2) is a cysteine protease that is responsible for the release of Ufm1 from Ufm1-conjugated cellular proteins, as well as for the generation of mature Ufm1 from its precursor. The 2.6 Å resolution crystal structure of mouse UfSP2 reveals that it is composed of two domains. The C-terminal catalytic domain is similar to UfSP1 with Cys294, Asp418, His420, Tyr282, and a regulatory loop participating in catalysis. The novel N-terminal domain shows a unique structure and plays a role in the recognition of its cellular substrate C20orf116 and thus in the recruitment of UfSP2 to the endoplasmic reticulum, where C20orf116 predominantly localizes. Mutagenesis studies were carried out to provide the structural basis for understanding the loss of catalytic activity observed in a recently identified UfSP2 mutation that is associated with an autosomal dominant form of hip dysplasia.
Ubiquitin-fold modifier 1 (Ufm1) is a recently identified ubiquitin-like protein (UBL)5 (1). It shares several common properties with ubiquitin (Ub) and other UBLs. It is synthesized as an inactive precursor protein composed of 85 residues, with two amino acids following a highly conserved glycine in its C terminus, the exposure of which is required for its subsequent conjugation. The NMR structure of Ufm1 shows a similar tertiary structure to Ub and other UBLs despite the fact that it shares very little sequence identity (2). However, Ufm1 displays different surface features from Ub and UBLs, suggesting that it may recognize different partners. It has been demonstrated that Ufm1 is ligated to a number of proteins in HEK293 cells and mouse tissues via a conjugation mechanism similar to that of Ub and UBLs. Mature Ufm1 is activated by a novel E1-like activating enzyme, Uba5, and then transferred to its cognate E2-like conjugating enzyme Ufc1. Recently, a Ufm1-specific E3 ligase, Ufl1, and its cellular substrate, C20orf116, have also been identified (3). Although the biological function of Ufm1 conjugation has yet to be identified, the fact that both Ufm1 and its conjugating system are conserved in both metazoans and plants suggests potential roles in various multicellular organisms.
Like Ub and UBLs, Ufm1 requires specific proteolytic cleavage to remove two C-terminal residues to become its mature form. Two cysteine proteases of different lengths, UfSP1 and UfSP2, have been identified (4). The longer UfSP2 is present in most, if not all, multicellular organisms, whereas the shorter UfSP1 is not found in plants or nematodes. These proteases are also responsible for the removal of Ufm1 from native intracellular conjugates. Neither protease shares sequence homology with any of the five categorized deubiquitinating enzymes identified thus far nor with any previously known proteases. However, the crystal structure of mouse UfSP1 at 1.7 Å resolution revealed a papain-like fold with a unique active site that is composed of Cys and an Asp-Pro-His conserved box instead of the canonical Cys-His-Asp triad, and this Cys and Asp-Pro-His configuration of the catalytic residues seems to form a new subfamily of the cysteine protease superfamily (5).
A mutation within the human UFSP2 gene has recently been identified in a family with an autosomal dominant form of hip dysplasia, termed Beukes familial hip dysplasia (BFHD; MIM142669) (6), which is characterized by severe premature degenerative osteoarthritis of the hip joint.6 The UFSP2 mutation predicts the substitution of the highly conserved Tyr290 by His in the encoded protein. Sequence alignments indicated that the human UFSP2 Tyr290 is equivalent to Tyr282 in the mouse and also corresponds to the highly conserved Tyr41 of mouse Ufsp1. The crystal structure of mouse UfSP1 suggested that Tyr41 plays a role in oxyanion hole formation. Interestingly, the Y282H substitution in UfSP2 abolished the in vitro Ufm1-processing activity of mouse UfSP2, whereas the corresponding Y41H mutation in mouse UfSP1 reduced but did not abolish the activity.6
Here, we report the crystal structure of mouse UfSP2 at 2.6 Å resolution, which shows a unique protein fold for the N-terminal domain linked to the catalytic domain that is similar to UfSP1. We also show that the novel N-terminal domain plays a role in the interaction with its cellular substrate C20orf116 and thus in the recruitment of UfSP2 to the endoplasmic reticulum, where C20orf116 almost exclusively resides. A comparison of the crystal structures of UfSP1 and UfSP2 coupled with the results from a series of mutagenesis experiments on both UfSP2 and UfSP1 defines the structural requirements for the substrate recognition and catalysis and explains the loss of activity of the UfSP2 mutation associated with BFHD.
The cDNAs for Ufm1 (Swiss Protein Database code P61961) and Ufsp2 (Swiss Protein Database code Q99K23) from mouse were cloned into pET28a (Novagen) to generate N-terminal His-tagged proteins. In the case of Ufsp2, because the expressed protein was cleaved at Lys94 as confirmed by N-terminal amino acid sequencing, we have replaced it with Arg at this position to avoid cleavage. In addition, we added another mutation of Arg128 to Ala to evade cleavage upon standing for crystallization. The resulting vectors were transformed to Escherichia coli BL21(DE3) codon plus RIL (Stratagene) cells. The histidine-tagged proteins were purified initially using nickel affinity resins (GE Healthcare) equilibrated with 20 mm Tris-HCl (pH 8.0), 100 mm NaCl, and 1 mm tris(2-carboxyethyl)phosphine and further by Mono Q and gel filtration on a Superdex 75 26/60 column (GE Healthcare). The purified UfSP2 was concentrated to 10 mg/ml in a buffer containing 20 mm MES (pH 6.5), 100 mm NaCl, and 1 mm DTT. Selenomethionine-substituted UfSP2 was generated as described previously (29).
Initial crystallization screening was carried out at 295 K in 96-well Intelli plates (Hampton Research) using a Hydra II Plus One (MATRIX Technology) robotics system. It yielded microcrystals, and the conditions were further optimized using the hanging drop method. Diffraction quality crystals were obtained in 3 days by mixing equal volumes of 10 mg/ml mouse UfSP2 in 20 mm MES (pH 6.5), 100 mm NaCl, and 1 mm DTT with a reservoir solution containing 0.04 m K2HPO4 and 12% (v/v) PEG3350. The crystals of UfSP2 belong to the space group C2, with a = 184.53 Å, b = 56.04 Å, c = 143.27 Å, α = γ = 90°, and β = 128.01°, and contain two molecules per asymmetric unit, corresponding to a Matthews volume Vm of 2.78 Å3 Da−1. Attempts to crystallize the UfSP2 complex with Ufm1 did not yield crystals large enough to be suitable for high resolution data collection.
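The Matthews volume quoted above follows from the unit-cell parameters and the cell contents. The short Python sketch below illustrates the arithmetic only; it is not the authors' procedure. The function names are hypothetical, and the chain mass of about 52.5 kDa for His-tagged UfSP2 is an assumption, because the text does not state the molecular mass.

```python
import math


def monoclinic_cell_volume(a, b, c, beta_deg):
    """Volume in Å^3 of a monoclinic unit cell (alpha = gamma = 90°)."""
    return a * b * c * math.sin(math.radians(beta_deg))


def matthews_coefficient(cell_volume, n_asym_units, mols_per_asu, mol_weight_da):
    """Vm in Å^3/Da: cell volume divided by the total protein mass in the cell."""
    return cell_volume / (n_asym_units * mols_per_asu * mol_weight_da)


# Cell parameters as reported in the text.
volume = monoclinic_cell_volume(184.53, 56.04, 143.27, 128.01)

# Space group C2 has four asymmetric units per unit cell;
# the text reports two UfSP2 molecules per asymmetric unit.
# ASSUMPTION: ~52.5 kDa per His-tagged chain (not stated in the text).
ASSUMED_MW_DA = 52_500.0
vm = matthews_coefficient(volume, n_asym_units=4, mols_per_asu=2,
                          mol_weight_da=ASSUMED_MW_DA)

print(f"cell volume = {volume:.0f} Å^3, Vm = {vm:.2f} Å^3/Da")
```

With the assumed chain mass, the result comes out close to the reported 2.78 Å3 Da−1. A different mass estimate would shift the value proportionally.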
The x-ray diffraction data sets from the native and selenomethionine crystals were collected at beamline 4A of Pohang Light Source, Pohang, Korea. Crystals were equilibrated in a cryoprotectant buffer containing reservoir buffer plus 30% (v/v) ethylene glycol and then flash-frozen in a cold nitrogen stream at 100 K prior to collection. Data were processed, integrated, and scaled using the HKL2000 program suite (30), and the statistics are summarized in Table 1.
The crystal structure of UfSP2 was determined by the multiple wavelength anomalous diffraction phasing method, because all attempts at molecular replacement using UfSP1 failed. Initially 9 out of 11 possible selenium sites were found, and eventually all selenium sites were refined; the initial phases were calculated using the programs SOLVE (31) and RESOLVE (32). About 54% of the residues were automatically modeled as a polyalanine chain by RESOLVE, and the model was built further using the molecular modeling program COOT (33). The refinement was then carried out using CNS and REFMAC (34, 35) to an R-value of 23.8% and an Rfree of 29.8%, and the final model included 6609 protein atoms and 107 water molecules. The final refinement statistics are summarized in Table 1.
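For readers less familiar with the refinement statistics quoted above, the conventional crystallographic R-value can be written as follows. This is the standard textbook definition, given here as a reminder rather than as a formula stated in the text; Rfree is the same quantity evaluated only over a test set of reflections that was excluded from refinement.

```latex
R = \frac{\sum_{hkl} \bigl|\, |F_{\mathrm{obs}}(hkl)| - |F_{\mathrm{calc}}(hkl)| \,\bigr|}
         {\sum_{hkl} |F_{\mathrm{obs}}(hkl)|}
```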
Site-directed mutagenesis and loop exchanges on the residues that might be involved in catalysis were carried out using the QuikChange site-directed mutagenesis kit (Stratagene) following the manufacturer's instructions. Mutants of UfSP2 were produced as N-terminally His-tagged proteins with single point mutations at positions Tyr282, Cys294, Asn290, Thr422, and Met283. Chimeric regulatory, upstream, and neighboring loops of UfSP2 were made by substitution with the corresponding UfSP1 residues. The regulatory loop of UfSP2 (393GGVLA397) was replaced by 149GDADAQS155 of UfSP1 and vice versa. The upstream and neighboring loops were exchanged in UfSP2 carrying the quadruple mutation (Y282H/M283G/N290R/T422W): 284QDRI287 and 423GAEDL427 of UfSP2 were changed to 43CDGL46 and 180GTPKNR185 of UfSP1, respectively. The in vitro proteolysis assay was performed using GST-Ufm1-HA as a model substrate for the Ufm1 precursor as described previously (7). All proteolysis assays were performed by incubating appropriate amounts of UfSP enzymes with 6 μg of GST-Ufm1-HA at 37 °C, and the reaction was stopped by addition of SDS sampling buffer and analyzed using SDS-PAGE. The gels were then stained with Coomassie Brilliant Blue R-250.
HeLa cells were grown on coverslips and transfected with appropriate vectors. Two days after transfection, they were fixed by incubation for 10 min with 3.7% paraformaldehyde in PBS. Cells were washed three times with PBS containing 0.1% Triton X-100, permeabilized with 0.5% Triton X-100 in PBS for 5 min, and treated with 3% BSA in PBS for 1 h. They were then incubated for 1 h with appropriate antibodies. After washing with PBS containing 0.1% Triton X-100, cells were incubated for 1 h with FITC- or TRITC-conjugated secondary antibody in PBS containing 3% BSA. After washing, cells were observed using a confocal laser scanning microscope (LSM510; Carl Zeiss, Jena, Germany). Images were acquired using an 80× objective and then processed using Photoshop (Adobe Systems, Mountain View, CA).
For immunoprecipitation, cell lysates were prepared in 50 mm Tris-HCl (pH 7.4) containing 150 mm NaCl, 1 mm EDTA, 1 mm NEM, 1 mm sodium vanadate, 1 mm NaF, 1 mm PMSF, and 1× protease inhibitor mixture (Roche Applied Science). Cell lysates were incubated with appropriate antibodies for 1 h at 4 °C and then with protein A-conjugated agarose for the next 1 h.
UfSP2 from mouse, containing 461 amino acids, was crystallized, and the structure was determined using multiple wavelength anomalous dispersion data collected from the selenomethionine-substituted UfSP2 and refined at 2.6 Å resolution. Crystallized UfSP2 was mutated at three positions: C294S, K94R, and R128A to avoid cleavage during expression and crystallization, and all atoms were well defined in the electron density map except for the residues 53–55, 62–64, 81–102, and 117–133. Table 1 summarizes statistics on the crystallographic data. The overall structure has dimensions of 80 × 50 × 55 Å and consists of two domains, with the first domain composed of the 240 residues at the N terminus and the second domain consisting of the 200 residues at the C terminus as seen in Fig. 1A. The two domains are connected by a linker of about 20 residues, and the C-terminal tail forms additional interactions with the N terminus at the interface. The N-terminal domain, which is shaped like a rectangular box of 40 × 40 × 20 Å, has a six-stranded β-sheet with five helices. The strands are in the order of β2-β3-β1-β4-β5-β6, and the two long helices, α1 and α3, are on the somewhat concave face of the sheet, running diagonal to it, whereas the third helix, α4, packs at one end of the β-sheet (β5-β6) in the same direction as the sheet. The inner surface of the β-sheet facing the α-helices is highly hydrophobic, whereas the opposite side of the β-sheet is somewhat polar. The two long helices are amphipathic with the hydrophobic surfaces facing the β-sheet. The linker residues are practically packed against the N terminus of the β-sheet on the opposite side of the two parallel helices. The catalytic C-terminal domain has seven α-helices and seven β-strands and resembles the papain-like structure that was previously reported in mouse UfSP1 (7), and the catalytic triad is positioned on the surface cleft of the opposite side of the N-terminal domain.
In this crystal form, there are two UfSP2 molecules per asymmetric unit and the two are related by a noncrystallographic 2-fold and show a root mean square deviation of 1.6 Å (supplemental Fig. S1).
As expected, comparison of UfSP2 with other structures in the Protein Data Bank using the DALI algorithm (8) yielded UfSP1 as the most significant match, with a Z-score of 27.2. The next highest matches were Atg4B (PDB codes 2CY7 (9) and 2D1I (10)) with a Z-score of 12.7 and murine cytomegalovirus protease M48USP (PDB code 2J7Q) (11) with a score of 7.6. Atg4B is an essential enzyme in autophagy that cleaves nascent Atg8 at its C-terminal arginine residue and deconjugates Atg8 family proteins from a small adduct, phosphatidylethanolamine (12). Other deubiquitinating enzymes such as USP14 (PDB code 2AYN) (13) and OTU1 (PDB code 3C0R) (14) had much lower Z-scores, 4.6 and 3.6, respectively. When the N-terminal domain of UfSP2 alone was tested, there were no significant hits on structural similarity search using either DALI or TM-align (15). The highest similarities were found in a putative lipoprotein B and major histocompatibility complex (MHC) class I molecules but with Z-scores less than 5.
The catalytic domain of UfSP2 shows almost identical overall structure with UfSP1, as expected from the sequence identity of 36%. However, there are significant differences as indicated by the root mean square deviation of 1.9 Å for 210 Cα atoms. Some regions show a root mean square deviation greater than 3 Å, but they are mostly on the surface and more than 15 Å away from the active site. One surprising point is that UfSP2 has more prominent secondary structures than UfSP1 (Fig. 1B and supplemental Fig. S2). The helix α7, which harbors the catalytic cysteine residue, is longer in UfSP2, and there is a three-residue insertion after the helix. The loop between α7 and α8 interacts with its α5 helix, which turns 180° in comparison with the loop of UfSP1. A stretch of residues after β8 reorganizes into a longer helix (α11) and is coupled with the changes in residues between β12 and β13. The C terminus of UfSP2 has three extra residues, and they make contact with the N-terminal domain, e.g. the backbone amide of Ala460 forms hydrogen bonds with the carboxyl group of Asp231, and the C terminus of Leu461 forms hydrogen bonds to Arg219.
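The root mean square deviations quoted above are computed over matched Cα positions after the two structures have been superposed. The Python sketch below illustrates only the final step: it assumes the two coordinate lists are already optimally superposed and paired atom by atom. The function name and the toy coordinates are hypothetical and are not data from the structures.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation (in the input units, e.g. Å) between two
    equally long lists of (x, y, z) positions that are already superposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))


# Toy example (NOT real coordinates): every atom is displaced by 1 Å along z.
toy_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
toy_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (2.0, 0.0, 1.0)]
toy_value = rmsd(toy_a, toy_b)
print(f"toy RMSD = {toy_value:.2f} Å")
```

In practice the superposition itself (for example a least-squares fit) is the harder part, and structure-comparison programs perform it before reporting a value such as the 1.9 Å for 210 Cα atoms given in the text.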
As seen in Fig. 2A, all the atoms near the active site are well defined in the electron density map. The catalytic Cys294 is located on the N terminus of an α-helix utilizing dipole moment, and “Asp418–Pro419–His420” are located at the loop off a β-strand (Fig. 2B). This is the same as what is found in UfSP1 (5, 7). This differs from the canonical triad of the cysteine proteases where the Asp and His are located at two separate β-strands of the central β-sheet (16). His398 of UfSP2, which is part of a highly conserved stretch among UfSP2 (supplemental Fig. S3), lies near the canonical histidine position, thereby casting doubt on the identification of the catalytic residues. However, when this residue was replaced by alanine, the in vitro activity of the mutant was the same as that of wild type UfSP2 (Fig. 2C). Indeed, in the crystal structure, His398 is too far from the catalytic Cys294, i.e. 7.2 and 3.9 Å away from the Sγ atom of Cys294 and Nδ1 of His420, respectively. Furthermore, superposition of papain shows that Leu396 is located at the canonical histidine position. The loss of activity caused by mutations of Cys294, Asp418, and His420 further confirms the identification of the catalytic triad, and Tyr282 is responsible for the formation of the oxyanion hole (Fig. 2). The in vitro activity was assessed by using GST-Ufm1-HA as a model substrate for Ufm1 precursor as was used in previous studies (4).
Based on the crystal structure of UfSP1 and the NMR peak shifts in the UfSP1-Ufm1 complex, we predicted that the loop connecting β3 and β4 as well as Trp98 may play a role in Ufm1 recognition and/or stabilization (7). These correspond to the loop connecting β9 and β10 and Trp342 in UfSP2. This loop is referred to as the “R-loop” (regulatory loop) hereafter. At the outset, the R-loop in UfSP2 is slightly shorter than that of UfSP1 (Figs. 1B and 3). To test whether this loop indeed participates in Ufm1 recognition, it was mutated. The R-loop of UfSP2 was swapped with that of UfSP1, i.e. the residues 393GGVLA397 in UfSP2 were replaced by the corresponding loop in UfSP1, namely 149GDADAQS155. As shown in Fig. 3C, both the chimera UfSP1 with UfSP2 R-loop (UfSP1-RL2) and wild type UfSP1 digested the substrate completely within 2 h, whereas the chimera UfSP2 with UfSP1 R-loop (UfSP2-RL1) showed limited activity. These results suggest that although this loop is not strictly conserved, it plays a role in the recognition of Ufm1 precursor.
The active site was further dissected to understand the lack of activity observed for the mouse equivalent Ufsp2 mutation associated with BFHD (Ufsp2 Y282H is the mouse equivalent of the human UFSP2 Y290H BFHD-associated mutation). Of note was the finding that UfSP1 Y41H, which corresponds to UfSP2 Y282H, cleaved GST-Ufm1-HA at about a 3-fold lower rate than wild type UfSP1 (Fig. 4C). Because Y41H retained the enzymatic activity (although significantly reduced), the residues that are not conserved within 6 Å of the oxyanion hole Tyr of UfSP2 were examined. These residues included Met283, Asn290, and Thr422 of UfSP2, which correspond to Gly42, Arg49, and Trp179 of UfSP1, respectively (Fig. 4A). To the inactive Y282H UfSP2 mutant, additional mutations were introduced and tested for their in vitro activities against GST-Ufm1-HA to see whether the enzymatic activity could be restored to the level seen in UfSP1 Y41H. As shown in Fig. 4C, incorporation of Arg at amino acid 290 resulted in a slight recovery of activity. Introduction of an additional mutation at position 422, i.e. Y282H/N290R/T422W, led to a further recovery in enzymatic activity. A similar effect was observed in the case of the Y282H/M283G/N290R triple mutant. However, incorporation of mutations at all four sites, i.e. Y282H/M283G/N290R/T422W, did not seem to yield additional enhancement. We then tested the possible effect of residues further away from Tyr282, in two adjacent loops that differ from one another as seen in Fig. 4B. These residues correspond to Gln284–Ile287 and Gly423–Leu427 in UfSP2, and Cys43–Leu46 and Gly180–Arg185 stretches in UfSP1 (Fig. 1B), referred to as the “U-loop” (upstream loop) and “N-loop” (neighboring loop), respectively. When the U-loop of UfSP2 was replaced with the corresponding loop of UfSP1 in addition to the quadruple mutant above, it showed a dramatic restoration of enzymatic activity (i.e. the chimera of UfSP2 with its U-loop replaced by that of UfSP1 showed activity that was nearly the same as that of wild type UfSP2) (Fig. 4C). On the other hand, the replacement of the N-loop showed relatively little effect on activity.
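The combined mutants in this paragraph are written in a compact slash-separated notation (wild type residue, position, replacement). As an illustration only, the Python sketch below parses such labels and checks them against the wild type UfSP2 residues named in the text (Tyr282, Met283, Asn290, Thr422). The function names and the small lookup table are hypothetical helpers, not part of the authors' workflow.

```python
import re

# Wild type residues of mouse UfSP2 at the positions discussed in the text.
WILD_TYPE = {282: "Y", 283: "M", 290: "N", 422: "T"}

_MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutant(label):
    """Split a label such as 'Y282H/N290R' into (wild, position, new) tuples."""
    mutations = []
    for token in label.split("/"):
        match = _MUTATION.match(token.strip())
        if match is None:
            raise ValueError(f"not a point-mutation label: {token!r}")
        mutations.append((match.group(1), int(match.group(2)), match.group(3)))
    return mutations


def check_against_wild_type(mutations, wild_type=WILD_TYPE):
    """Return a list of human-readable problems; an empty list means consistent."""
    problems = []
    for wild, position, new in mutations:
        expected = wild_type.get(position)
        if expected is None:
            problems.append(f"{wild}{position}{new}: position {position} not in table")
        elif expected != wild:
            problems.append(f"{wild}{position}{new}: expected {expected} at {position}")
    return problems


quadruple = parse_mutant("Y282H/M283G/N290R/T422W")
print(quadruple)
print(check_against_wild_type(quadruple))
```

A check of this kind is mainly useful for catching mislabeled positions when many combinations of the same few substitutions are being tracked.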
In an attempt to determine the role of the unique N-terminal domain of UfSP2, we first examined the subcellular localization of UfSP2 and its N- and C-terminal domains. As shown in Fig. 5, all are localized in both the nucleus and the cytoplasm. Significantly, a portion of the N-terminal domain in the cytoplasm appeared as speckles, raising the possibility that the N-terminal domain plays a role in the localization of UfSP2 in subcellular organelles, such as the ER and the Golgi apparatus (see below).
Recently C20orf116, a protein of unknown function, has been identified as a target for Ufm1 modification; whereas the newly identified Ufl1 serves as a Ufm1 E3 ligase for C20orf116, and UfSP2 deconjugates Ufm1 from its cellular substrate (3). It has also been found that C20orf116 predominantly localizes in the ER. In accordance with this finding, C20orf116 almost completely co-localized with calreticulin, a marker protein for the ER in HeLa cells but minimally with β-COP, a marker protein for Golgi bodies (Fig. 6). Co-expression with C20orf116 revealed that UfSP2 and its N-terminal domain, but not the C-terminal domain, strongly co-localize in the ER. These results suggest that C20orf116 possibly interacts with the N-terminal domain of UfSP2 and recruits it to the ER. To test this possibility, FLAG-C20orf116 was expressed in HeLa cells with and without Myc-tagged UfSP2 and its N- and C-terminal domains. Immunoprecipitation analysis by using an anti-FLAG antibody revealed that full-length UfSP2 and its N-terminal domain, but not its C-terminal domain, co-precipitated with C20orf116 (Fig. 7A), indicating that the N-terminal domain of UfSP2 interacts with C20orf116. To confirm this finding, Myc-C20orf116 was expressed in HeLa cells with and without HisMax-tagged UfSP2 and its N- and C-terminal domains. Pulldown analysis using Ni2+-nitrilotriacetic acid-conjugated agarose showed that the amount of C20orf116 that co-precipitated with UfSP2 or its N-terminal domain was much higher than that with the C-terminal domain of UfSP2 (Fig. 7B). Collectively, these results suggest that the N-terminal domain of UfSP2 plays a key role in the recognition of C20orf116 and thus in the recruitment of UfSP2 to the ER, where C20orf116 is predominantly localized.
Most of the deubiquitinating enzymes that cleave ubiquitin or ubiquitin-like proteins from their precursors or protein conjugates contain not only the catalytic domains necessary for proteolytic activity but also additional N- or C-terminal extensions (17). Some of these extensions include ubiquitin binding domains, ubiquitin-like domains, and others that may participate in protein-protein interactions, but quite often they are not well characterized despite the fact that they may play an important role in modulating substrate specificity, cellular localization, or other physiological functions (18,–20). In the case of UfSPs, unlike UfSP1, which only has a catalytic domain, UfSP2 has an N-terminal extension of about 250 residues that has no homologous proteins other than UfSPs when searched using BLAST. It is worthwhile mentioning that in some species such as Drosophila, rice, and Caenorhabditis elegans, the N-terminal domain is even longer, with about 100 extra residues (supplemental Fig. S3).
The crystal structure of mouse UfSP2 reveals a unique fold that is not found in any presently known cellular protein. The DALI search gave putative lipoproteins as the closest match; however, they are quite different in that these two proteins have a three stranded β-sheet with two α-helices of different lengths stacked next to each other. Again, the heavy chains of MHC class I are somewhat similar, as these have a more extensive β-sheet and two parallel α-helices of similar length. Yet the topology of the domain is quite different, and the groove between the two long helices (α1 and α3) is too narrow to fit anything like a peptide. It is worth mentioning that the sequence identity of the N-terminal domain is much lower than that of the catalytic domain, yet the residues of α4 facing the groove interface, as well as the beginning part of the linker, are conserved among all known forms of UfSP2 (supplemental Fig. S3).
Many deubiquitinating enzymes contain additional domains other than catalytic domains, and these extra domains have been suggested to function in the regulation of subcellular localization, substrate specificity, or physiological function (17, 18). The 2.6 Å resolution crystal structure of mouse UfSP2 reveals that it is composed of two domains that are connected by a 20-residue-long linker. C20orf116 is a cellular substrate of UfSP2 as well as a target protein for Ufm1 modification by the Ufm1 E3 ligase Ufl1 (3). Ectopically expressed C20orf116 interacts with UfSP2 and its N-terminal domain but much less with its C-terminal domain in vivo. These findings strongly suggest that the N-terminal extension of UfSP2 plays a critical role in the recognition of its cellular substrate C20orf116, and thus in the recruitment of UfSP2 to the ER, where C20orf116 predominantly localizes.
The catalytic domain of UfSP2 has a papain-like fold. Mutagenesis of the active site residues shows complete loss of activity for C294A and H420A mutants and some residual activity for the D418A mutant (Fig. 2C). Mutation of His398, which is located near the canonical histidine position, to an alanine did not affect the enzyme activity as expected. These results confirm Cys294, Asp418, and His420 as the bona fide catalytic residues for UfSP2 as was previously suggested based on the structure of UfSP1 (7). Cys294 and His420 are directly involved in the reaction, serving as a nucleophilic attacking group and a general acid-base catalytic element, respectively, whereas Asp418 is essential in stabilizing the transition state through electrostatic interactions. The residual activity seen for the D418A mutant may be due to the aid of the His398 residue (Fig. 2). In fact, another crystal form of UfSP2 having one molecule per asymmetric unit shows an additional water molecule between His420 and His398 that could form a hydrogen bond linkage between the two (data not shown). It is worth mentioning that when the corresponding aspartate in UfSP1 (Asp175) was mutated to an alanine, it showed a complete loss of activity. Additionally, in UfSP1 the position of His398 is occupied by Lys156, and the water molecule is not present.
In addition to the catalytic triad, we suggested that Trp98 and the regulatory loop containing residues 149GDADAQS155 in UfSP1 play a role in stabilizing Ufm1 binding, i.e. Ufm1 recognition, based on the binding study using NMR and modeling. In UfSP2, these correspond to Trp342 and 393GGVLA397 (Figs. 3 and 4). Residues around Trp342, which is part of a highly conserved stretch, practically have the same conformation as is found in UfSP1. However, the regulatory loop connecting β9 and β10 in UfSP2 is two residues shorter, and it takes up a somewhat different conformation. In the crystal structures there are differences between the two as shown in Fig. 3. In UfSP1, the loop is held in place via interactions with water molecules as well as the residues connecting α6 helix and β7 strand, whereas the corresponding loop in UfSP2 does not make significant interactions, except the hydrophobic interaction between Trp342 and Val395. Among the UfSP2s these regulatory loops are highly conserved, and the V395A mutant shows reduced activity (data not shown).
To test how important the loop is in Ufm1 recognition, we swapped the regulatory loops between the two, i.e. the loop (residues 393–397) in UfSP2 was replaced by that of UfSP1 (residues 149–155), and vice versa. Although we expected some changes in enzymatic activities, the results were dramatic. In the case of UfSP1, when the loop is replaced with the shorter loop, there was no significant difference in activity; however, the UfSP2 chimera with a longer loop showed decreased enzymatic activity within 2 h (Fig. 3C). Therefore, the length, as well as the composition of this loop, seems to be important in Ufm1 processing. It is also worthwhile mentioning that UfSP1 is more active than UfSP2 in processing Ufm1 (4).
It is interesting to note that in the recently determined crystal structure of Atg4B complexed with LC3, a mammalian ortholog of yeast Atg8, the “regulatory loop,” showed a large conformational change upon LC3 binding (21). In this case, the loop masking the entrance to the active site of free Atg4B is lifted by Phe119 of LC3, whereas overall structures are almost identical. In the case of UfSP2 and HAUSP, in addition to domain-wide conformational changes, reordering of the catalytic site upon Ub binding was reported (22, 23), whereas SENPs appear to require relatively minor local structural rearrangement at the catalytic site in response to the binding of SUMO (24).
During cysteine protease-mediated catalysis, the catalytic cysteine performs a nucleophilic attack on the carbonyl carbon of the scissile peptide bond. The histidine in the catalytic triad facilitates a proton transfer, and the aspartate stabilizes the transition state. In addition to the catalytic triad, another important component of the active site is the oxyanion hole, which is typically provided by glutamine/glutamate or asparagine (25, 26). In the case of both UfSP1 and UfSP2, the oxyanion hole is formed by the backbone amide of catalytic cysteine and tyrosine. The reaction is completed by the attack of a water molecule, which results in the release of free Ufm1 molecules.
The identification of the Tyr290 to histidine mutation associated with BFHD led us to consider the role of this residue and others in its vicinity during catalysis. First, when the equivalent Tyr282 in mouse UfSP2 was mutated to histidine, UfSP2 almost completely lost its catalytic activity, whereas UfSP1 carrying the analogous Y41H mutation retained activity, albeit at a reduced level (Fig. 4C). In the crystal structure of UfSP1, the hydroxyl oxygen of Tyr41 makes a tight hydrogen bond to a water molecule (W1221) with a length of 2.8 Å, which in turn makes a hydrogen bond to the amide backbone of Cys53 in UfSP1. The water molecule is 3.2 Å away from the hydroxyl oxygen of Tyr282, between two UfSP2s in the asymmetric unit. When Tyr41 is mutated to histidine, it is most likely that the water molecule is still within reasonable distance of the functional groups of histidine to stabilize the tetrahedral intermediate of Ufm1 as a part of the oxyanion hole in UfSP1, whereas this may not be the case in UfSP2.
Next, the residues within 6 Å of Tyr282 that are not conserved, such as Met283, Asn290, and Thr422, were mutated to test for their effects on the oxyanion hole. Mutations of Asn290 did not show much effect when replaced by alanine, lysine, or arginine. However, when it was mutated together with Thr422 and/or Met283, the activity of the enzyme was partially restored. When the upstream loop spanning residues 282–289 of UfSP2, namely 282YMQDRIDD289, was replaced by that of UfSP1, namely 41YGCDGLDD48, the enzymatic activity was restored to nearly the same level as that of wild type UfSP2, in which strictly conserved residues are shown in boldface type. This is the relatively conserved loop connecting the β-strand with the oxyanion hole and the helix with the catalytic cysteine. In UfSP1, Asp47 and Arg49 make two by two hydrogen bonds that are stacked by the side chain of Trp179, shielding the side chain of Tyr41. In UfSP2, Met283 occupies the side chain position of Trp179 of UfSP1, and Trp179 is replaced by Thr422. Arg49 of UfSP1 is replaced by Asn290, so Tyr282 is not as well shielded. In UfSP2, Gln284, which is part of the upstream loop, forms hydrogen bonds with the carbonyl oxygen and Nϵ2 of His281. Ultimately, these configurations make the loop connected to Tyr282 less flexible and might restrict the movement of the side chain of this residue, preventing it from adopting the conformation required for an oxyanion hole.
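The loop exchange and point mutations discussed above can be followed at the sequence level using only the two eight-residue segments quoted in the text. The Python sketch below is illustrative: the helper names are hypothetical, and the operation is plain string editing rather than a model of the protein. It shows that UfSP2 residues 282–289 carrying Y282H, M283G, and the U-loop of UfSP1 become identical in sequence to the corresponding segment of UfSP1 Y41H.

```python
def swap_segment(fragment, first_resnum, start, end, replacement):
    """Replace residues start..end (inclusive, protein numbering) of a fragment
    whose first residue carries the number first_resnum."""
    i = start - first_resnum
    j = end - first_resnum + 1
    if i < 0 or j > len(fragment) or i >= j:
        raise ValueError("segment lies outside the fragment")
    return fragment[:i] + replacement + fragment[j:]


def point_mutate(fragment, first_resnum, wild, position, new):
    """Apply one point mutation after checking the wild type residue."""
    index = position - first_resnum
    if not 0 <= index < len(fragment):
        raise ValueError(f"position {position} lies outside the fragment")
    if fragment[index] != wild:
        raise ValueError(f"expected {wild} at {position}, found {fragment[index]}")
    return fragment[:index] + new + fragment[index + 1:]


# Segments exactly as quoted in the text.
UFSP2_282_289 = "YMQDRIDD"   # mouse UfSP2 residues 282-289
UFSP1_41_48 = "YGCDGLDD"     # mouse UfSP1 residues 41-48

# U-loop exchange: 284QDRI287 of UfSP2 replaced by 43CDGL46 of UfSP1.
chimera = swap_segment(UFSP2_282_289, 282, 284, 287, "CDGL")
# Two of the point mutations from the quadruple mutant fall in this segment.
chimera = point_mutate(chimera, 282, "Y", 282, "H")
chimera = point_mutate(chimera, 282, "M", 283, "G")

ufsp1_y41h = point_mutate(UFSP1_41_48, 41, "Y", 41, "H")
print(chimera, ufsp1_y41h, chimera == ufsp1_y41h)
```

Sequence identity over this segment does not by itself explain the recovered activity; the structural argument in the text concerns how the loop positions and shields the oxyanion hole residue.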
The crystal structure of UfSP2 reveals a two-domain structure in which an N-terminal domain with a novel fold is connected by a 20-residue-long loop to the catalytic C-terminal domain; the latter is similar to UfSP1, with Cys294, Asp418, His420, Tyr282, and a regulatory loop participating in catalysis. The novel N-terminal domain plays a role in the recognition of its cellular substrate C20orf116 and in the recruitment of UfSP2 to the endoplasmic reticulum, where the substrate predominantly localizes. Of some 80–90 deubiquitinating enzymes, a number are linked to physiological disorders, some by mutation, through altered expression levels, and/or as part of regulatory complexes (27). One example is the I93M mutation of ubiquitin C-terminal hydrolase-L1 (UCH-L1), a highly abundant neuronal enzyme, which is associated with familial Parkinson disease (28) and results in a roughly 50% reduction in catalytic activity. Although the exact mechanism of how UfSP2 inactivity relates to BFHD has yet to be identified, our mutagenesis results provide a structural basis for understanding the loss of catalytic activity that is observed in the UfSP2 mutant that is associated with BFHD.
We thank Drs. K. Joo and S.H. Kang for discussions and the staff at the 4A beamline of Pohang Light Source for help during data collection.
*This work was supported in part by Functional Proteomics Center Grant FPR08B2-280, the 21C Frontier Research and Development Program of the Korea Ministry of Science and Technology, a Korea Institute of Science and Technology Institutional Program grant (to E. E. K.), Korea Research Foundation Grant KRF-2005-084-C00025, and Korea Science and Engineering Foundation Grant M10533010001-05N3301 (to C. H. C.).
The atomic coordinates and structure factors (code 3OQC) have been deposited in the Protein Data Bank, Research Collaboratory for Structural Bioinformatics, Rutgers University, New Brunswick, NJ (http://www.rcsb.org/).
6C. M. Watson and G. Wallis, manuscript in preparation.
5The abbreviations used are: