HHS Public Access; Author Manuscript; Accepted for publication in peer reviewed journal
 
J Struct Biol. Author manuscript; available in PMC 2010 November 1.
Published in final edited form as:
PMCID: PMC2802830
NIHMSID: NIHMS134220

Characterization of the kringle fold and identification of a ubiquitous new class of disulfide rotamers

Abstract

The disulfide-bridged chains in the kringle (K) and fibronectin type II (FN2) domains are characterized using a taxonomy that considers the regularities in both β-secondary structure and cystine clusters. The structural core of the kringle fold comprises an assembly of two β-hairpins (a “β meander”) accommodating two overlapping disulfides; one cystine is incorporated in adjacent β-strands, whereas the other is located just beyond the ends of non-adjacent β-strands. The dispositions of the (N, C) termini of the two overlapping disulfides in the kringle fold are given as (m, j+1) and (i−1, k+1), in which m, i, j, and k (m < i < j < k) are residues fulfilling the relations m ~w j+3 and i ~n j ~w k, where the relationship ~n/w associates residues belonging to a narrow/wide hydrogen-bonded pair of an antiparallel β-sheet. This pattern is the structural signature of the kringle fold and is referred to as the “disulfide kringle-cross”. The metrics of this motif are quantified, revealing structural differences between the two families of the kringle fold. The conformations of disulfides in the kringle fold are poorly accommodated by existing classification schemes. To elucidate the nature of these rotamers we have performed density functional theory (DFT) calculations for diethyl disulfide. A new classification for the disulfide conformations in proteins is proposed, consisting of six rotamer types: spiral, trans–spiral, corner, trans, hook, and staple. Its relation with previous classification schemes is specified. A survey of high-resolution X-ray structures reveals that the disulfide conformations are clustered around the averaged conformations for the six classes. Average conformation dihedral and distance values are in excellent agreement with the DFT values. The two overlapping disulfides in kringle domains adopt the trans–spiral conformation that appears to be ubiquitous (~17%) in proteins. One of the disulfides stretches across the β-meander, invoking “strain” in the disulfide conformational state. The relevance of the new classification and the concept of strain are briefly discussed in the context of disulfide bond cleavage in proteins.

Keywords: Cystine, Density functional theory (DFT), Disulfide, Disulfide rotamer classification, Disulfide-rich fold, Fibronectin type 2 domain (FN2), Kringle domain, Neurotrypsin kringle domain, Protein structure comparison

1. Introduction

Since structure is better conserved than sequence, structural similarity between proteins may indicate an evolutionary relationship, even if their sequence homology is low [1-3]. Kringle (K) and fibronectin type 2 (FN2) domains share a sequence identity of 15% or less but are considered to be divergent members of the kringle fold. The K domains are ~ 80 amino acid residues long and contain three disulfides. The FN2 domains comprise ~ 40 amino acid residues with two disulfides and have been proposed to have evolved from the K domains through partial deletion of residues from the terminal strands [4, 5]. Since the N- and C-termini of the neurotrypsin kringle (NT/K) are shorter than in other K domains, it seemed attractive to hypothesize that NT/K represents a structural intermediate in the evolutionary transition from K to FN2 domains [5]. This example illustrates the dilemma of what constitutes a meaningful definition of “fold” in the evolutionary context: either an exact recapitulation of the chain path or just a common structural core of the chain?

In conventional protein structure taxonomy, the kringle fold has been characterized as a three-disulfide, triple-loop folding domain with overlapping inner disulfides [6, 7]. However, this definition is too restrictive because it excludes the two-disulfide-containing proteins of the FN2 family and too permissive because it includes proteins from other folds, for example the periplasmic binding protein-like II fold (e.g. serum transferrin, milk lactoferrin, and avian ovotransferrin). Alternatively, Cheek et al. [8] have used the concept of a common structural core for assigning both families to the kringle fold, which they described as a pattern of two overlapping disulfides and five anti-parallel β-strands, arranged in “two perpendicular layers”. Although this qualitative description provides a family-specific sketch of the β-structure of the FN2 domain, it incorrectly depicts the conserved structural unit of the kringle superfamily. Classification schemes, such as SCOP [9] and CATH [10], describe the kringle fold in different ways. CATH assigns the kringle fold to the “mainly β/β barrel” of the conventional fold classification. SCOP recognizes the class of small disulfide-rich folds (SDF) and groups the K and FN2 families together into the kringle fold under the denominator “disulfide-rich fold; nearly all β”, but does not quantify their structural relationship. Harrison and Sternberg [11] have proposed a rigorous SDF classification based on the regularities in cystine clusters and their relationship with the β-secondary structure that they connect. Using this classification, we present here an in-depth analysis of the structural features of the kringle fold, including the geometries, mutual arrangements, and dispositions of the disulfides relative to the β-structure.

We find that the conformations of the overlapping disulfides are conserved in the kringle fold but that they are poorly described by the disulfide rotamer classes proposed in the protein literature. Previous analyses of the disulfide conformations in proteins employed visual inspection for classification with an emphasis on the side-chain dihedral angles, in particular χ2, χ3 and χ2′ [11-13]. Richardson sorted the disulfides into two major classes: left-handed spirals and right-handed hooks [12]. Hutchinson and Thornton loosely grouped disulfides based on the signs of these angles into four categories: left-handed and right-handed spirals, right-handed hook and short right-handed hook [13]. Harrison and Sternberg renamed the short right-handed hook the right-handed staple and identified the right-handed corner conformation [11]. The cystine rotamer classification in proteins presented here differs from previous ones in that it comprises six categories derived from density functional theory calculations for diethyl disulfide. The relevance of the new classification is discussed in the context of disulfide bond cleavage in proteins, in particular the potential role of disulfide rotamer strain in the release of angiostatin from plasmin.

2. Methods

2.1. Conservation scores for residues in multiple sequence alignments

A conservation score was calculated for each column of the multiple alignment within families using Scorecons [14]. The scores range from 0 to 1, in which columns with all identities are assigned the highest value and the others get a score according to the degree of similarity between their residues.
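As a rough illustration of what a per-column score in the range 0 to 1 looks like, the sketch below computes an identity-only score, namely the frequency of the most common residue in each column. This is a simplified stand-in and not the Scorecons algorithm [14], which additionally credits similarity between different residues; the function name and the toy alignment are hypothetical.

```python
from collections import Counter

def identity_scores(alignment):
    """Return one score per alignment column: the fraction of sequences
    carrying the most common residue in that column (1.0 = all identical).
    Simplified stand-in; residue similarity is NOT taken into account."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for column in zip(*alignment):
        most_common_count = Counter(column).most_common(1)[0][1]
        scores.append(most_common_count / len(column))
    return scores

# Toy alignment (hypothetical sequences, not taken from Figure 1).
toy_alignment = ["CYRG", "CWRG", "CFKG"]
toy_scores = identity_scores(toy_alignment)
print(toy_scores)
```

Columns with all identities receive 1.0, as in the description above; a similarity-aware score would rate the mixed columns higher than this identity-only version does.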

2.2. Density functional theory calculations

Density functional theory calculations were performed using Becke's three-parameter hybrid functional (B3LYP) and the 6-311G* basis set as provided by the Gaussian 03 (release B.05) software package [15]. The geometry optimizations were terminated upon reaching the default convergence criteria.

2.3. Data set construction

A non-redundant data set of disulfide-bridged chains was constructed from the PDB as of January 14, 2009. High-resolution structures with a disulfide bond count ≤ 20 were selected at a 30% sequence identity cutoff. Only single-chain (monomeric) proteins were included in our analysis. This list was further refined by requiring a resolution ≤ 2.0 Å and an R-factor ≤ 0.30 (198 structure hits).
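The selection criteria can be expressed as a simple filter. The following is a minimal sketch under stated assumptions: the record type, its field names, and the example entries are hypothetical, and the 30% sequence identity redundancy removal is assumed to have been applied upstream and is not implemented here.

```python
from dataclasses import dataclass

@dataclass
class Entry:
    pdb_id: str        # hypothetical identifier, not a real PDB code
    n_chains: int      # number of protein chains in the entry
    n_disulfides: int  # number of disulfide bonds
    resolution: float  # in Å
    r_factor: float

def passes_filters(entry, max_disulfides=20, max_resolution=2.0, max_r_factor=0.30):
    """Apply the thresholds quoted in the text to one entry.
    Requiring at least one disulfide is an assumption implied by
    'disulfide-bridged chains'."""
    return (
        entry.n_chains == 1
        and 1 <= entry.n_disulfides <= max_disulfides
        and entry.resolution <= max_resolution
        and entry.r_factor <= max_r_factor
    )

# Hypothetical entries for illustration only.
entries = [
    Entry("hyp1", 1, 3, 1.8, 0.21),   # passes all criteria
    Entry("hyp2", 2, 3, 1.5, 0.19),   # rejected: not a single chain
    Entry("hyp3", 1, 24, 1.9, 0.20),  # rejected: too many disulfides
    Entry("hyp4", 1, 2, 2.4, 0.22),   # rejected: resolution too low
]
selected = [e.pdb_id for e in entries if passes_filters(e)]
print(selected)
```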

3. Results and discussion

3.1. Characterization of the kringle fold

3.1.1. Amino acid sequence alignments

Multiple sequence alignments (Figure 1) show that (i) the members of the K family share a sequence identity of ~ 30%; (ii) the members of the FN2 family share a sequence identity of ~ 50%; (iii) the sequence identity between the two families is below 15%. The two inner disulfide bridges, cystine b and cystine c, occur at identical positions in all sequences, and rank among the five strictly conserved residues in the kringle superfamily (Figure 1). The outer disulfide bridge, cystine a, connecting the residues near the N- and C-termini, is a unique feature of the K family and is not present in the FN2 family. In NT/K the N- and C-termini are five and three residues shorter, respectively, than in other proteins within the K family. The shortening does not, however, affect the specifically conserved positions in the K family (blue in Figure 1). NT/K fits the profile of a true member of an ancient K lineage, having ~ 30% identity in sequence alignment.

Figure 1
Multiple sequence alignments of the kringle superfamily. The conservation scores are calculated for each column of the multiple alignments between and within the FN2 and K families and given in the middle, top and bottom rows, respectively. Positions ...

3.1.2. Kringle topology

In Figure 2 X-ray structures of the plasminogen kringle 4 [6] (PGN/K4) and the second type II module from matrix metalloproteinase 2 [16] (MMP2/2) domains, distinct examples representing the K and FN2 families, have been chosen for pictorial comparison with the NT/K structure recently determined by NMR [17]. To facilitate our understanding of hydrogen bonding pattern and disposition of disulfides with respect to β-secondary structure in kringles, we have projected the primary sequence of NT/K on a plane and aligned hydrogen bonds horizontally such that the residues appear in arrays (Figure 3Aa). Residues that belong to the same array are referred to as being “in register” (for example, residues N10–C18–Y60–D68). The secondary-structure elements have been assigned using the Kabsch & Sander notations [18]. An inspection of structures reveals that the core, consisting of two antiparallel β-bridges (B and C) and β-ladder (D), is shared by members of the K and FN2 families.

Figure 2
Comparison of NT/K with representative structures from the kringle and FN2 families: PGN/K4 (code 1pk4), NT/K (code 2k51), and MMP2/2 (code 1ck7). The structures were superimposed using cystines b and c. β-Strands are shown in orange. The disulfides ...
Figure 3
(A) Schematic representation of the β-secondary structure in NT/K. Hydrogen bonds observed in the structure are marked by dotted lines. Hydrogen bond arrows point from donor to acceptor. Disulfide bonds, labeled SS-a, SS-b and SS-c, are indicated ...

Cystines b and c have a conserved disposition relative to the β-secondary structure that they connect. The N- and C-termini of both cystines b and c are separated by one array of residues (Figure 3A). The N-terminal half-cystine residue Cys18b is in register with conserved residue Tyr60, which is two residues up along the backbone with respect to the C-terminal half-cystine residue Cys58b. The N-terminal half-cystine residue Cys47c is in register with Gly70, which is two residues down along the backbone with respect to the C-terminal half-cystine residue Cys72c (Figure 3A). The two N-terminal half-cystine residues (Cys18b and Cys47c) are separated by loop 1. The N-terminal half-cystine residue Cys47c is in register with the C-terminal half-cystine residue Cys58b. Residues Cys18b and Cys58b of cystine b are incorporated in adjacent β-strands, whereas residues Cys47c and Cys72c of cystine c are both located just beyond the ends of non-adjacent β-strands (Figure 3A). Cystine b links β-bridge B to the core β-ladder D whereas cystine c stretches across the entire β-meander. These relationships can be formalized as follows. If we use the symbol ~n/w to indicate a pair of residues forming a narrow/wide hydrogen-bonded pair of an antiparallel β-sheet, then the dispositions of the (N, C) termini of cystines b and c are given as (m, j+1) and (i−1, k+1), where m, i, j, and k (m < i < j < k) are residues fulfilling the relations m ~w j+3 and i ~n j ~w k (in FN2, only one of the two hydrogen bonds between residues m and j+3 is present, see below). These relationships are summarized in Figure 3B, where arrays of residues that are in register are indicated by dashed lines.
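The index relations can be made concrete with a short sketch. The function name and return layout are hypothetical; the NT/K values m = 18, i = 48, j = 57, k = 71 are not stated explicitly in the text but are back-calculated here from the quoted half-cystine positions (Cys18–Cys58 and Cys47–Cys72), so they should be read as an inference.

```python
def kringle_cross(m, i, j, k):
    """Given the four reference residues of the motif (m < i < j < k),
    return the (N, C) termini of cystines b and c and the hydrogen-bonded
    pairs implied by m ~w j+3 and i ~n j ~w k."""
    if not (m < i < j < k):
        raise ValueError("residue indices must satisfy m < i < j < k")
    return {
        "cystine_b": (m, j + 1),
        "cystine_c": (i - 1, k + 1),
        "wide_pairs": [(m, j + 3), (j, k)],
        "narrow_pairs": [(i, j)],
    }

# NT/K numbering (indices inferred from the residue positions in the text).
ntk = kringle_cross(18, 48, 57, 71)
print(ntk["cystine_b"], ntk["cystine_c"])
```

With these indices the wide pair (m, j+3) corresponds to residues 18 and 60, in line with Cys18b being in register with Tyr60.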

The β-secondary structure of the core is conserved in the kringle superfamily, apart from a minor variation between the K and FN2 families. In the K family structures both hydrogen bonds 19→59 and 61→17 within β-bridge B are conserved whereas in the FN2 module family only one of the two hydrogen bonds (19→59) is present (Figure 3A). The conservation of at least one hydrogen bond ensures the alignment of the linked residues into the arrays of Figure 3B throughout the kringle superfamily. Such close structural correspondence is striking given that the residues involved in hydrogen bonds 19→59 and 61→17 are not conserved between the two families (Figure 1). This observation highlights the importance of the interplay between disulfides and adjacent hydrogen bonding of the backbone in the stabilization of the kringle fold.

The divergence in β-secondary structure between the K and FN2 families is mainly due to a drastic reduction in the length of loop 1 in the FN2 family, i.e. 12 residues compared to 26–29 residues in the K family (Table S1). Unlike NT/K and PGN/K4, MMP2/2 comprises an antiparallel β-sheet within loop 1 (Figure 2), which is a conserved feature within the FN2 family. An overlapping pattern of a left-handed polyproline II (PPII) helix (residues 17–20, containing a conserved half-cystine at position 18) with β-bridges A and B (11–17 and 18–60, containing conserved residues at positions 18, 59 and 60) is a conserved feature within the K family. In the FN2 family there is a shorter PPII-helix containing residue Cys18, but there is no equivalent of β-bridge A and only one hydrogen bond of B, viz. 19→59 (Figure 3A).

3.1.3. Core geometry

Three-dimensional alignments based on the superposition of the Cα atoms of the residues that participate in the conserved hydrogen bonding of the disulfide-bridged β-sheet (Figure 3A) show differences in the core structure (Table S2). This motif contains the residues at positions 18, 19, 47, 48, 56–60, 68–72 (Figure 1); only five of them are strictly conserved (residues 18, 47, 57, 58 and 72) while the other residues of the motif show a wide range of conservation scores between and within the families. The average RMSDs between the K and FN2 families (0.76 Å) are consistently larger than the averages within the families (0.41 Å). The difference of the averages (0.35 Å) for the inter- and intra-family comparisons represents a structural discrepancy between the K and FN2 families. The same pattern is observed when NT/K is compared with members of the K and FN2 families: NT/K shares consistently lower RMSDs with the K family. [N.B. The RMSDs for comparisons of NT/K (an NMR structure) are ~ 0.1 Å larger than the RMSDs for the comparisons between the other structures (all X-ray structures). However, the difference of 0.34 Å between the average RMSDs for the NT/K–K and the NT/K–FN2 comparisons is about equal to the difference between the average RMSDs within and between the K and FN2 families.]

Comparison of the backbone orientation angles, κ, in the kringle superfamily shows that κ is ~ 55° and ~ 107° for cystines b and c, respectively (Table 1). Hence, the kringle fold conserves the parallel and orthogonal orientations for the flanking backbone segments of cystines b and c, respectively. The polypeptide chain follows a right-screw path from Cys47c to Cys72c via Cys58b.

Table 1
Angles between flanking backbones of the disulfide bonds of inner cystines in the kringle superfamily

The dihedral angles and distances for the cystine-pair cluster (Table 2), while confined to narrow ranges, exhibit significant family-specific differences in the averages. In the K family cystines b and c meet the two-cystine distance clustering criterion [11] (three D_CαCα values ≤ 7.5 Å) for closely clustered cystine pairs, with an average minimum distance of ~ 5.2 Å between Cys47c and Cys58b, negative cluster dihedrals v_bc^Cα ≈ −90° and v_bc^Sγ ≈ −146°, and θ_bc^Sγ ≈ 31°. In the FN2 family the cystine pairs tend to be loosely clustered (two D_CαCα values ≤ 7.5 Å) with v_bc^Cα ≈ −74°, v_bc^Sγ ≈ −111°, and θ_bc^Sγ ≈ 56°. The K and FN2 families differ consistently in the cluster dihedrals, namely by ~ 16°, 35°, and 25° in v_bc^Cα, v_bc^Sγ, and θ_bc^Sγ, respectively (Table 2). In NT/K the dihedral angles and distances for the cystine-pair cluster are within the standard deviation from the average for the K family.
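The distance criterion quoted above can be sketched as a count over the four inter-cystine Cα–Cα distances. This is a minimal illustration, assuming that "three values ≤ 7.5 Å" means at least three and that pairs with fewer than two short distances are simply left unclustered; the function name, the third label, and the toy coordinates are hypothetical.

```python
import math

def classify_cluster(ca_b, ca_c, cutoff=7.5):
    """ca_b and ca_c each hold the Cα coordinates (x, y, z in Å) of the
    N- and C-terminal half-cystines of one disulfide. The four distances
    between a Cα of one cystine and a Cα of the other are compared with
    the cutoff."""
    distances = [math.dist(p, q) for p in ca_b for q in ca_c]
    n_close = sum(d <= cutoff for d in distances)
    if n_close >= 3:
        return "closely clustered"
    if n_close == 2:
        return "loosely clustered"
    return "not clustered"

# Toy coordinates (hypothetical, chosen only to exercise both outcomes).
cystine_b = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
close_partner = [(0.0, 5.0, 0.0), (9.0, 5.0, 0.0)]
loose_partner = [(0.0, 5.0, 0.0), (12.0, 5.0, 0.0)]
print(classify_cluster(cystine_b, close_partner))
print(classify_cluster(cystine_b, loose_partner))
```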

Table 2
Geometry for cystine-pair cluster in the kringle superfamily

3.1.4. The disulfide kringle-cross

The analysis of the kringle fold given in the preceding sections can be summarized as follows: (1) there are two overlapping disulfides; the first half-cystine in the primary sequence (N1) is bridged to the third (C1) and the second (N2) to the fourth (C2). (2) The C-termini of the two disulfide bridges belong to the same β-hairpin and are separated by an array of residues. (3) The N-terminal half-cystine residue N2 is in register with the C-terminal half-cystine residue C1. (4) The backbone exhibits a right-screw path from N2 to C2 via C1. (5) The N1–C1 and N2–C2 cystines connect marginally parallel and nearly orthogonal flanking chain segments, respectively. (6) The two cystines are clustered and have negative values for the cluster dihedrals v_bc^Cα and v_bc^Sγ.

The arrangement of the disulfides in the kringle fold is reminiscent of the disulfide β-cross, which occurs in non-homologous domains with a variety of folds [11]: the two motifs share properties (1), partially (2), (5), and partially (6). However, there are also a number of distinctive differences: the C-terminal half-cystine residues Cys58b and Cys72c are out of register in terms of wide-pair hydrogen-bonding position; cystines b and c form a cluster with negative v_bc^Cα and v_bc^Sγ values, unlike the majority of β-crosses, which have positive cluster dihedrals; the handedness of the backbone screw path is opposite to that in β-crosses. The two overlapping disulfides appear as a cross symbol when viewed along the line connecting the midpoints of the bonds (Figure 3C). To distinguish the disulfide pair in the kringle fold from the disulfide β-cross we have coined the term “disulfide kringle-cross”. The comparative analysis shows that the disulfide kringle-cross motif is present in both the K and FN2 families and is the defining feature of the kringle fold.

3.2. Disulfide conformations

3.2.1. Dihedrals of disulfides in kringles and other disulfide-rich proteins

Table 3 lists dihedrals (χ1, χ2, χ3, χ2′, and χ1′) and distances (Cα−Cα and Cβ−Cβ) of disulfides in high-resolution kringle structures. To place these values in the context of the protein data at large, we have compared them with the phenomenological distributions given by Petersen et al. [19]. The following observations can be made: (a) |χ3| ≈ 90° with χ3 > 0 and χ3 < 0 appearing with equal probabilities. Thus, from a general perspective, both the left- and right-handed cystine conformers are admissible in kringle structures. (b) The maxima in the distributions for χ1 and χ1′, which are quite similar, are centered at ~ −60°, ~ 60°, and 180°, with −60° being, by far, the most likely value. The χ1 and χ1′ values for cystine b and the χ1′ for cystine c (~ −60°) are in the preferred domain. The value χ1′ > 0 for cystine c is less frequent and appears to be a special feature conserved throughout the kringle superfamily (Table 3). (c) For χ3 < 0 there is a high probability that both χ2 and χ2′ are ~ −60° and a minor likelihood of finding ~ 180°. In combination with the preferred values χ2, χ2′ ≈ −60°, the value χ3 < 0 yields the most frequent disulfide rotamer, i.e. the left-handed spiral (class ggg, see below).
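For readers who wish to reproduce such dihedrals from atomic coordinates, the sketch below computes a signed dihedral angle from four points using the usual sign convention (positive for a clockwise rotation of the far bond relative to the near bond when viewed along the central bond). It is a generic geometric routine, not the procedure used to generate Table 3; for χ3, for example, the four points would be Cβ, Sγ, Sγ′, and Cβ′. The function names and test points are hypothetical.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (degrees, in (-180, 180]) defined by four points."""
    b0 = _sub(p0, p1)            # central atom 1 -> first atom
    b1 = _sub(p2, p1)            # central bond
    b2 = _sub(p3, p2)            # central atom 2 -> last atom
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Components of b0 and b2 perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))

# Idealized test geometry (not protein coordinates).
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```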

Table 3
Geometrical parameters of disulfides in the kringle superfamily

The conformations of the two overlapping disulfide bridges, cystine b and cystine c, do not belong to any of the rotamer types described in the protein literature [11-13]. To obtain a better understanding of the disulfide conformations, we have performed density functional theory (DFT) calculations (sections 3.2.2 and 3.2.3) [15] for the model system diethyl disulfide (EtSSEt). These calculations also give insight into the origin of the systematic differences (see Table 3) in the values of the dihedrals and Cα−Cα distances for the two inner disulfides (section 3.2.6).

3.2.2. Rotamers of diethyl disulfide

Here we propose a six-class taxonomy for the disulfides in proteins on the basis of the rotamers for diethyl disulfide. The geometry of diethyl disulfide is characterized by the dihedrals directing the ethyl moieties, χ2 and χ2′, with values of ~ ±60° (gauche, g±) or ~ 180° (trans, t), and the central dihedral χ3 ~ ±90° (gauche, g±), cf. Figure 4A. There are 18 combinations for (χ2, χ3, χ2′); the dihedrals χ1 and χ1′ are not considered here because the backbone atoms adjacent to the Cα atoms have been replaced by hydrogens in the diethyl disulfide model. As shown in an elegant study by Görbitz [20], these combinations can be divided into six classes (Figure 4B). Three of these classes each contain 4 combinations representing rotamers of C1 symmetry and the remaining three classes each contain 2 combinations representing rotamers of C2 symmetry (Table 4); for example, class tgg′ = {tg+g−, tg−g+, g+g−t, g−g+t}, and class ggg = {g+g+g+, g−g−g−} (N.B. g and g′ indicate dihedrals with opposite signs). Any two rotamers in which the dihedrals appear in reversed order, e.g., as in the pair tg+g+ and g+g+t, represent identical diethyl disulfide molecules but are distinguishable in the context of the protein environment (with the left-hand symbol referring to the N-terminus and the right-hand symbol to the C-terminus of the disulfide). Thus, the classes for the diethyl disulfide model each contain 2 rotamers, namely, the left- and right-handed forms (giving a total of 12 rotamers), whereas the classes for the disulfides in proteins contain 4 and 2 rotamers for classes of C1 and C2 symmetry, respectively (giving a total of 18 rotamers). The 18 protein rotamers will be denoted: R-ggg, L-ggg (Class 1 or ggg); R-tgg, L-tgg, R-ggt, L-ggt (Class 2 or tgg); R-ggg′, L-ggg′, R-g′gg, L-g′gg (Class 3 or ggg′); R-tgt, L-tgt (Class 4 or tgt); R-tgg′, L-tgg′, R-g′gt, L-g′gt (Class 5 or tgg′); R-g′gg′, L-g′gg′ (Class 6 or g′gg′).
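The counting argument can be checked by brute-force enumeration. The sketch below lists all 3 × 2 × 3 = 18 combinations of (χ2, χ3, χ2′) states, relabels each outer dihedral as g (same sign as χ3), g′ (opposite sign), or t, and groups the combinations irrespective of which end is written first. The helper names and the ASCII spelling g' for g′ are illustrative choices, not notation from the paper.

```python
from collections import defaultdict
from itertools import product

END_STATES = ("g+", "g-", "t")   # allowed states for chi2 and chi2'
CENTRE_STATES = ("g+", "g-")     # allowed states for chi3
LABEL_ORDER = ("t", "g", "g'")   # fixes how class names are written

def relative_label(end_state, centre_state):
    """Label an outer dihedral relative to the sign of chi3."""
    if end_state == "t":
        return "t"
    return "g" if end_state == centre_state else "g'"

def class_of(chi2, chi3, chi2p):
    """Class name that ignores the order of the two ends."""
    ends = sorted(
        (relative_label(chi2, chi3), relative_label(chi2p, chi3)),
        key=LABEL_ORDER.index,
    )
    return ends[0] + "g" + ends[1]

classes = defaultdict(list)
for chi2, chi3, chi2p in product(END_STATES, CENTRE_STATES, END_STATES):
    classes[class_of(chi2, chi3, chi2p)].append((chi2, chi3, chi2p))

for name, members in classes.items():
    print(name, len(members))
```

Classes whose two ends carry the same label (ggg, tgt, g′gg′) collect 2 combinations each, and the mixed classes (tgg, ggg′, tgg′) collect 4 each, which reproduces the 12 + 6 = 18 split described above.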

Figure 4
(A) Dihedrals of diethyl disulfide viewed along their rotation axes. The atom of the view axis in the forefront is indicated by a large circle and the one in the background by a black circular dot. Bonds connected to the atom in the forefront are solid ...
Table 4
List of equilibrium conformations of diethyl disulfide in vacuum calculated with density functional theory

The proposed six-class taxonomy for the disulfides in proteins is based on the following definitions for g, g′, and t: in the R (right-handed) form, χ3 ∈ (0, 180°) and χ2, χ2′ ∈ (0, 120°) are denoted g, and χ2, χ2′ ∈ (−120°, 0) are denoted g′. In the L form, χ3 ∈ (−180°, 0) and χ2, χ2′ ∈ (−120°, 0) are denoted g, and χ2, χ2′ ∈ (0, 120°) are denoted g′. Irrespective of the L- or R-form, χ2, χ2′ ∈ (120°, 240°) [or, equivalently, χ2, χ2′ ∈ (−240°, −120°)] are denoted t. These definitions imply that any disulfide belongs to one, and only one, of the 18 rotamers and, consequently, to one, and only one, of the 6 rotamer classes.
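These interval definitions translate directly into a small classifier. The following is a minimal sketch under stated assumptions: the function names are hypothetical, angles are in degrees, values falling exactly on an interval boundary (which the open intervals leave undefined) are assigned by an arbitrary convention, and the trivial names for classes 4 to 6 (trans, hook, staple) are matched to tgt, tgg′, and g′gg′ by the order in which the six types are listed, which is an inference.

```python
def wrap(angle):
    """Map an angle in degrees onto the interval (-180, 180]."""
    a = angle % 360.0
    return a - 360.0 if a > 180.0 else a

# Class code -> trivial name (the last three assignments are assumed from list order).
TRIVIAL_NAMES = {
    "ggg": "spiral", "tgg": "trans-spiral", "ggg'": "corner",
    "tgt": "trans", "tgg'": "hook", "g'gg'": "staple",
}
LABEL_ORDER = ("t", "g", "g'")

def classify(chi2, chi3, chi2p):
    """Return (rotamer, class code, trivial name) for one disulfide."""
    hand = "R" if wrap(chi3) > 0 else "L"

    def label(chi):
        chi = wrap(chi)
        if abs(chi) >= 120.0:                 # (120, 240) is the same as |chi| > 120
            return "t"
        same_sign_as_chi3 = (chi > 0) == (hand == "R")
        return "g" if same_sign_as_chi3 else "g'"

    n_label, c_label = label(chi2), label(chi2p)
    rotamer = f"{hand}-{n_label}g{c_label}"   # N-terminal label first
    ends = sorted((n_label, c_label), key=LABEL_ORDER.index)
    code = ends[0] + "g" + ends[1]
    return rotamer, code, TRIVIAL_NAMES[code]

# Dihedrals quoted in section 3.2.6 for cystine b of NT/K.
print(classify(-69, -80, -172))
```

Applied to the NT/K cystine b dihedrals quoted in section 3.2.6, the sketch returns the L-ggt rotamer of class tgg, consistent with the assignment given there.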

The geometry of diethyl disulfide has been optimized for one representative from each of the six classes, using density functional theory (see Methods). The relative energies of the classes are listed in Table 4 in the order of increasing energy. Figure S1 shows the minima for the classes ggg, tgg, and tgt in the potential energy surface of diethyl disulfide as a function of χ2 and χ2′ in the angular range relevant for the kringle fold. Table 4 also gives the values of dihedrals and selected distances in the optimized structures. The listed dihedrals, although deviating from the ideal values given above, confirm the classification scheme. Figure 4B shows the optimized geometries for the six classes. Interestingly, ggg is the conformation of diethyl disulfide with the lowest energy, suggesting that the ubiquity of this rotamer in proteins (see below) derives from an intrinsic property of the disulfide bridge. Class g′gg′ is highest in energy, due to the steric repulsion arising from the short Cα−Cα distance. Class g′gg′ is the only class for which the Cα−Cα distance is shorter than Cβ−Cβ (Table 4). The disulfide conformations span a remarkably broad range of Cα−Cα distances (~ 3 Å) at the expense of only minor energy penalties (Table 4). Given the smallness of the energy differences, these quantities inevitably depend on the choice of the basis set and density functional, affecting the results in detail but not in essence. Thus, while it is tempting to consider the disulfides as the principal determinants of the kringle fold, it should not be overlooked that the great flexibility of these entities leaves considerable room for other interactions, e.g. hydrogen bonding, to shape the structure. The rotameric energies show a high degree of additivity with respect to changes in χ2 and χ2′: E(tgt) ≈ 2 E(tgg) and E(tgg′) ≈ E(tgg) + E(ggg′). However, E(g′gg′) > 2 E(ggg′), due to the short Cα−Cα distance in g′gg′ (Table 4).

3.2.3. Potential energies for diethyl disulfide

The rotameric energies as a function of Cα−Cα distance for the classes with minima in the 5 – 7 Å range are presented in Figure 5. The lowest system energy stays, throughout the distance domain shown, within a narrow, sub-kcal-per-mole energy range, attesting to the flexibility of the disulfide bond. Compression of the tgt conformer gives, after passing a narrow transition range around ~ 5.85 Å, the tgg′ conformer (Figure S2). The potential energy curve for ggg crosses the curves for tgg and tgt without changing conformation. The definition ranges for the potential wells in Figure 5 were limited by a lack of convergence in the geometry optimizations outside these ranges. Except for Cα−Cα distances beyond 6.7 Å (Figure 5), the disulfide bridge can exist in several, generally Cα−Cα-strained, conformations. The effect of Cα−Cα strain on the dihedrals for the ggg and tgg conformations is illustrated in Figure 6 (cf. section 3.2.6). Disulfides possess the conformational and structural flexibility needed for insertion into a wide range of dispositions with respect to the hydrogen bonding patterns of protein backbones.

Figure 5
Energies of rotamers of diethyl disulfide versus Cα-Cα distance obtained by relaxed scans of DFT total energies. The dots indicate disulfides b (at 5.9 Å) and c (at 6.5 Å).
Figure 6
Dihedrals for conformational classes ggg (dots) and tgg (solid) of diethyl disulfide versus Cα-Cα distance obtained in scans of Figure 5. χ2′ ≈ χ2 for class ggg.

3.2.4. Statistics for the six-class taxonomy of disulfide conformations in proteins

Table 5 presents the results of a statistical analysis of protein disulfides on the basis of the six-class taxonomy proposed in section 3.2.2. We have analyzed a test set of 453 disulfides in 198 high-resolution X-ray structures, selected with the criteria specified in Methods. The PDB codes of the crystal structures are given in Table S3 of the Supporting Information, together with the dihedrals, Cα−Cα distances, and rotamer type for each disulfide of the test set. The number of incidences, the corresponding percentages of the total number, and the averages of the inner dihedrals and Cα−Cα distances are listed in Table S4 for each of the 18 rotamers. The data for the 18 rotamers have subsequently been combined to obtain the statistics for the six rotamer classes presented in Table 5. The spiral (ggg) is the most frequent class of rotamers (42%), followed by the corner (ggg′, 20%) and, within a narrow margin, the trans–spiral (tgg, 17%). Thus, although the trans–spiral is ignored in existing classification schemes (see section 3.2.5), it is actually a ubiquitous rotamer, which comes as no surprise given that tgg appears as the class with the second lowest energy in Table 4. The remaining three classes give a total of 21%. The deviations of the dihedral averages in Table 5 from the idealized values (±60°, 180° for χ2 and χ2′, and ±90° for χ3) are closely matched by those of the DFT values listed in Table 4. For example, the large value found for χ3 in the staple is in excellent agreement with the DFT result for χ3 in this rotamer. Also the dependence of Cα−Cα distance on rotamer class (Table 5) is remarkably well reproduced by the DFT calculations (Table 4). Thus, while the conformations of the individual disulfides are distorted by interactions with the protein environment (Table S3), the class averages (Table 5) appear to have conserved the characteristics of the diethyl disulfide rotamers, which are the basis of the proposed six-class taxonomy (Table 4).
To investigate to what extent the members of the classes are clustered around the averaged conformations we have refined the six-class taxonomy by requiring the dihedral angles to belong to a set of narrower ranges centered at the averages, ⟨χ⟩i, for rotamers i = 1 – 18: χ2 ∈ (⟨χ2⟩i − Δχ2, ⟨χ2⟩i + Δχ2), χ3 ∈ (⟨χ3⟩i − Δχ3, ⟨χ3⟩i + Δχ3), and χ2′ ∈ (⟨χ2′⟩i − Δχ2, ⟨χ2′⟩i + Δχ2). Using Δχ2 = 40° and Δχ3 = 30°, instead of the original ranges Δχ2 = 60° and Δχ3 = 90°, only 10% of the disulfides are found to elude classification (the classes are affected comparably, cf. Supporting Information, Tables S5 and S6), compared to 85% for the hypothetical case that the disulfides were dispersed randomly. Hence, the proposed six-class taxonomy is strongly supported by the protein structure data.
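One way to arrive at a figure close to the quoted 85% is a volume argument in the three-dimensional (χ2, χ3, χ2′) angle space. The sketch below assumes that random disulfides are uniformly distributed over 360° in each dihedral and that the 18 rotamer windows do not overlap; whether the value in the text was obtained this way is not stated, so this is only a plausibility check with a hypothetical function name.

```python
def covered_fraction(delta_chi2, delta_chi3, n_rotamers=18):
    """Fraction of (chi2, chi3, chi2') space covered by n_rotamers boxes of
    half-widths delta_chi2, delta_chi3, delta_chi2 (degrees), assuming the
    boxes do not overlap."""
    box_volume = (2 * delta_chi2) * (2 * delta_chi3) * (2 * delta_chi2)
    return n_rotamers * box_volume / 360.0 ** 3

refined = covered_fraction(40, 30)    # narrowed windows
original = covered_fraction(60, 90)   # original class definitions

print(f"unclassified at random, refined windows:  {1 - refined:.1%}")
print(f"unclassified at random, original windows: {1 - original:.1%}")
```

Under these assumptions the refined windows cover about 15% of the angle space, so roughly 85% of randomly dispersed disulfides would elude classification, whereas the original ranges tile the space completely.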

Table 5
Distribution of disulfide conformations in proteins over the six-class taxonomy and averages for dihedral angles and distances

3.2.5. Comparison of six-class taxonomy with earlier classifications

Several classification schemes for the disulfide conformations in proteins have been proposed in the literature [11, 13, 21]. These schemes have in common that the side-chain dihedrals, χ2, χ3, and χ2′, are classified according to their signs (+ or −), leading to a total of 3 classes (Table 6), while in our taxonomy χ2 and χ2′ are classified as g, g′, or t, resulting in a total of 6 classes (Table 4). The correlation between the three-class and six-class taxonomies is shown in Table 6. For example, the hook of the three-class taxonomy includes disulfides of the trans–spiral, corner, trans, and hook of the six-class taxonomy.

Table 6
Correlation between disulfide classifications according to three-class and six-class taxonomies

In consulting the literature one should be aware that the trivial names used for the rotamer classes in proteins depend on the source. Table 7 lists recurrent trivial names and how they are defined by various authors. For example, the terms “corner” as defined by Harrison and Sternberg [11] and “hook” as defined by Schmidt, Ho, and Hogg [21] refer to the same rotamer of the three-class taxonomy but describe distinct species in the six-class taxonomy. The motif unanimously called “staple” by the latter authors is featured as a “short hook” by Hutchinson and Thornton [13]. By definition, the use of the descriptors “trans” and “trans–spiral” is reserved for the six-class taxonomy (Table 7).

Table 7
Definition of descriptors used for classifying disulfide rotamers in proteins a

3.2.6. Disulfide rotamer strain in the kringle fold

The values for χ2, χ3, and χ2′ listed in Table 3 indicate that cystines b and c belong to class 2, tgg = {R-tgg, L-tgg, R-ggt, L-ggt}, the trans–spiral rotamers, in both the K and FN2 families. Cystine b is invariably L-ggt, while cystine c is L-ggt with two exceptions (R-ggt in 1pk4 and 2fd6).

Although cystines b and c belong to the same class (tgg), their Cα−Cα distances (5.9 Å for b and 6.5 Å for c) are remarkably different (Table 3). The distance in cystine b nearly coincides with the energy minimum for conformation tgg (Figure 5) located at 5.84 Å (Table 4) and is thus “relaxed”. The Cα−Cα distance in cystine c is 0.6 Å longer than at the tgg minimum because the disulfide stretches across the β-meander (Figure 3A), invoking “strain” in the disulfide conformational state. A similar situation is found in all kringle domains (Table 3). Cystine b in NT/K has a conformation with χ2 = −69°, χ3 = −80°, χ2′ = −172° versus −71°, −90°, −176° obtained by the DFT calculations at 5.9 Å for class tgg (Figure 6), yielding an average deviation of only 5° for the three dihedrals. Cystine c in NT/K has a conformation with χ2 = −97°, χ3 = −96°, χ2′ = −157° versus −94°, −96°, −165° calculated by DFT at 6.5 Å (Figure 6), giving an average deviation of 4°. A comparison of cystines c and b reveals significant changes in the dihedrals: Δχ2 = −28° / −23°, Δχ3 = −16° / −6°, and Δχ2′ = +15° / +11° (observed / calculated). The observed trends in both the magnitude and sign of the strain-induced changes in the dihedrals are consistent with the DFT calculations.
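The average deviations and strain-induced shifts quoted above can be reproduced from the listed dihedrals. The following is a minimal sketch using only the NT/K and DFT values given in the text; the helper names are hypothetical, and differences are wrapped to the interval [−180°, 180°) so that angles near ±180° are compared correctly.

```python
def angle_diff(a: float, b: float) -> float:
    """Signed difference a - b in degrees, wrapped to [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0


def mean_abs_deviation(observed, calculated) -> float:
    """Mean absolute wrapped deviation over the three dihedrals."""
    return sum(abs(angle_diff(o, c)) for o, c in zip(observed, calculated)) / len(observed)


# (chi2, chi3, chi2') in degrees, as quoted in the text.
b_obs = (-69.0, -80.0, -172.0)   # cystine b in NT/K
b_dft = (-71.0, -90.0, -176.0)   # DFT, class tgg at 5.9 A
c_obs = (-97.0, -96.0, -157.0)   # cystine c in NT/K
c_dft = (-94.0, -96.0, -165.0)   # DFT, class tgg at 6.5 A

dev_b = mean_abs_deviation(b_obs, b_dft)
dev_c = mean_abs_deviation(c_obs, c_dft)

# Shifts on going from cystine b to cystine c (c minus b).
shift_obs = tuple(angle_diff(c, b) for c, b in zip(c_obs, b_obs))
shift_dft = tuple(angle_diff(c, b) for c, b in zip(c_dft, b_dft))

print(f"mean |deviation|, cystine b: {dev_b:.1f} deg")
print(f"mean |deviation|, cystine c: {dev_c:.1f} deg")
print("observed shifts (c - b):  ", shift_obs)
print("calculated shifts (c - b):", shift_dft)
```

The observed and calculated shifts agree in sign for all three dihedrals, which is the trend noted in the text.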

Figure 5 shows that the observed tgg conformations are not the lowest-energy conformations for diethyl disulfide at the Cα−Cα distances of cystines b and c: ggg and tgt are lower in energy than tgg at the Cα−Cα values for cystine b (5.9 Å) and cystine c (6.5 Å), respectively, albeit by small margins of 100–300 cm−1. Given the smallness of these energies, it is not surprising that the inclusion of interactions ignored in the diethyl disulfide model, such as hydrogen bonding with the backbone, may alter the order of the conformational energies.
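To put the 100–300 cm−1 margins in perspective, the sketch below converts them to kJ/mol and to multiples of the thermal energy kT. The temperature of 298.15 K is an assumption made here for illustration and is not taken from the text; the function names are hypothetical, and the constants are the exact SI defining values.

```python
H = 6.62607015e-34     # Planck constant, J s
C_CM = 2.99792458e10   # speed of light, cm/s
K_B = 1.380649e-23     # Boltzmann constant, J/K
N_A = 6.02214076e23    # Avogadro constant, 1/mol


def wavenumber_to_kj_per_mol(energy_cm: float) -> float:
    """Convert an energy in cm^-1 to kJ/mol."""
    return energy_cm * H * C_CM * N_A / 1000.0


def wavenumber_in_kt(energy_cm: float, temperature_k: float = 298.15) -> float:
    """Express an energy in cm^-1 in units of kT at the given temperature."""
    return energy_cm * H * C_CM / (K_B * temperature_k)


for margin in (100.0, 300.0):
    print(
        f"{margin:.0f} cm^-1 = {wavenumber_to_kj_per_mol(margin):.2f} kJ/mol "
        f"= {wavenumber_in_kt(margin):.2f} kT at 298.15 K"
    )
```

At the assumed temperature these margins are of the order of kT, which is why modest additional interactions could plausibly reorder the conformers.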

3.2.7. Disulfide class and function

Disulfides in proteins have been functionally classified as structural, catalytic, or allosteric bonds [21]. Structural bonds function as stabilizing units of protein structure; catalytic bonds are involved in electron storage by cycling between oxidized (disulfide) and reduced (dithiol) conformations, and allosteric bonds act as potential redox switches for protein function. An attempt to establish a relationship between functional and structural classes has been reported by Hogg and coworkers [21]. Pursuing the idea that a high conformational energy presents a favorable condition for allosteric function, they have evaluated the energies of a large number of disulfides, using the AMBER force field [22], and proposed that the allosteric and catalytic bonds belong, respectively, to the structural classes of the “staple” and “hook” rotamers. This proposal is tempting but also raises a number of issues. (1) AMBER is not a reliable tool for calculating conformational energies of disulfide bonds. The AMBER force field accounts only for torsion energies but ignores distance-dependent contributions due to steric interactions. As a result, the AMBER force field underestimates the steric repulsion arising from the close contact between the Cα atoms in the staple conformation of the six-class taxonomy, which sets this rotamer apart in terms of potential energy among the six classes (Table 4, cf. section 3.2.2). (2) The meaning of the term “strain energy” is ambiguous. In section 3.2.6 strain energy is defined as the distortion energy of a disulfide relative to the equilibrium conformation for the class to which the disulfide belongs. In our definition, a disulfide with a relaxed staple conformation has zero strain energy. Confusingly, the same disulfide appears as severely “strained” if the equilibrium energy of the spiral is taken as a reference [21]. We note that among the fold-defining overlapping disulfides in the plasminogen kringle 5 (PGN/K5), viz. the trans–spirals b and c in the six-class taxonomy, it is the strained disulfide (cystine c, see section 3.2.6) that is subject to cleavage. This observation lends support to the relevance of the strain concept as defined in this paper in the context of disulfide cleavage. (3) The structural classes used in ref. [21] refer to the inadequate three-class taxonomy (Table 6). For example, the “staple” of the three-class taxonomy comprises disulfides from the trans, hook, and staple classes in the six-class taxonomy (Table 6). As disulfide conformations are distinct in geometry and relative energy (Figure 4 and Table 4), the six-class taxonomy is more accurate than the three-class taxonomy for exploring the structural origin of the functional differentiation of disulfides. (4) There are notable exceptions to the proposed rule. For example, the “allosteric” disulfide cystine c in PGN/K5, which is cleaved in the formation of angiostatin and microplasmin from plasmin [23], is a spiral and not a staple in the context of the three-class taxonomy (N.B. Cystine c is trans–spiral in the six-class taxonomy, cf. point (3)).

It is important to bear in mind that insight into quantitative structure–function relationships is attainable only by tackling the problem from a mechanistic perspective. In this context, our preliminary DFT calculations on disulfides within a protein are revealing: they point toward the importance of hydrogen bonding interactions between the backbone amide groups and the disulfides in determining the energy and conformation of the latter. As hydrogen bonds are known to have a major impact on redox potentials, for example those of iron–sulfur proteins [24], they are expected to be one of the key determinants of allosteric or catalytic function of disulfides. Additionally, hydrogen bonding may play a role in ensuring the reversibility of disulfide cleavage by keeping the resulting thiols in the immediate spatial vicinity.

4. Conclusions

We have characterized the kringle fold on the basis of both the disposition of cystines relative to the conserved hydrogen bonding pattern of the backbone and cystine descriptors. The metrics of the kringle fold are quantified, revealing structural differences in the geometrical descriptors of the core, including dihedral angles and distances for the cystine-pair cluster, between the K and FN2 families of the kringle superfamily. The geometrical descriptors of the core, together with family-specific secondary structure features, provide quantitative criteria for comparing protein structures and assessing evolutionary relations between families. By these criteria, the neurotrypsin kringle domain (i) belongs to the kringle fold, (ii) has the conserved arrangement of cystine clusters and the secondary structure of the K family, and (iii) lacks the distinct conserved β-structure of the FN2 family. Hence, NT/K is an unambiguous member of the K family and not a structural intermediate between the K and FN2 families, as previously suggested [5].

A new classification for the disulfide conformations in proteins is proposed on the basis of density functional theory calculations for diethyl disulfide. This taxonomy differs from the previous three-class taxonomy in that there are six rotamer types: spiral, trans–spiral, corner, trans, hook, and staple. The six-class taxonomy is confirmed by statistical analysis of high-resolution X-ray protein structures. The trans–spiral conformation, adopted by the disulfides of the kringle fold, is identified as a ubiquitous new class of disulfide rotamers in proteins.

Supplementary Material

Supplementary Information:

Table S1 lists the numbers of residues in the loops of proteins from the kringle superfamily. Table S2 presents the results of pair-wise structural alignments of proteins from the kringle superfamily. Table S2 gives, for each comparison, the average RMSD for the Cα atoms of the inner disulfide bridges and proximal core residues (residues 18, 47, 48, 57–60, 68–72), evaluated for the optimal match of these atom sets. Table S3 presents the dihedral angles and Cα−Cα distances (denoted R) of 453 disulfides in the high-resolution X-ray structures of 198 proteins, selected according to criteria specified in Methods. Tables S4–S6 present the statistics for the 18 rotamers (see section 3.2.4). Figure S1 presents the potential energy surface for diethyl disulfide as a function of the dihedrals χ2 and χ2′, obtained by DFT calculations in the region of interest for the kringle fold. Figure S2 presents plots of the dihedrals χ2, χ3, and χ2′ vs. Cα−Cα distance obtained by relaxed scans of DFT total energies in Figure 5 for the rotamers tgt and tgg′ of diethyl disulfide.

Acknowledgments

This work was sponsored by NIH Grants HL-29409 and EB-0001474.

Abbreviations

apoA/K4: apolipoprotein A kringle 4
DFT: density functional theory
FA12: factor XIIa
FN2: fibronectin type 2 domain
HABP2: hyaluronan-binding protein 2
HGF/K1: hepatocyte growth factor kringle 1
HGFA: hepatocyte growth factor activator
K: kringle domain
M6PR: mannose 6-phosphate receptor
MMP2/2: second domain of matrix metalloproteinase 2
NT/K: neurotrypsin kringle
PGN/K4: plasminogen kringle 4
PGN/K5: plasminogen kringle 5
PDB: Protein Data Bank
PPII: left-handed polyproline II helix
SCOP: structural classification of proteins
SDF: small disulfide-rich fold
THRB/K1: prothrombin kringle 1
tPA/K2: tissue plasminogen activator kringle 2
RMSD: root-mean-square deviation
ROR1: tyrosine-protein kinase transmembrane receptor 1
uPA/K: urokinase-type plasminogen activator kringle

Footnotes

a. As a note of caution, we emphasize that Figure 3A illustrates the topology of the network of hydrogen bonds and disulfide bridges but not protein geometry. For example, one should not be misled by the “antiparallel” and “parallel” orientations of the flanking β-strands of cystines b and c, respectively, in Figure 3A.

b. According to the definition of a disulfide β-cross, the C-terminal half-cystine residues are in register with respect to the main-chain hydrogen bonding of the β-hairpin at wide pair positions [5].

References

1. Doolittle RF. Similar amino acid sequences - chance or common ancestry. Science. 1981;214:149–159.
2. Chothia C, Lesk AM. The Relation between the divergence of sequence and structure in proteins. EMBO Journal. 1986;5:823–826.
3. Holm L, Sander C. New structure - novel fold? Structure. 1997;5:165–171.
4. Patthy L. Evolution of the Proteases of Blood-Coagulation and Fibrinolysis by Assembly from Modules. Cell. 1985;41:657–663.
5. Ozhogina OA, Trexler M, Banyai L, Llinas M, Patthy L. Origin of fibronectin type II (FN2) modules: Structural analyses of distantly-related members of the kringle family identify the kringle domain of neurotrypsin as a potential link between FN2 domains and kringles. Protein Science. 2001;10:2114–2122.
6. Mulichak AM, Tulinsky A, Ravichandran KG. Crystal and molecular structure of human plasminogen kringle 4 refined at 1.9 Å resolution. Biochemistry. 1991;30:10576–10588.
7. Gehrmann M, Briknarova K, Banyai L, Patthy L, Llinas M. The col-1 module of human matrix metalloproteinase-2 (MMP-2): Structural/functional relatedness between gelatin-binding fibronectin type II modules and lysine-binding kringle domains. Biological Chemistry. 2002;383:137–148.
8. Cheek S, Krishna SS, Grishin NV. Structural classification of small, disulfide-rich protein domains. Journal of Molecular Biology. 2006;359:215–237.
9. Murzin AG, Brenner SE, Hubbard T, Chothia C. SCOP - a structural classification of proteins database for the investigation of sequences and structures. Journal of Molecular Biology. 1995;247:536–540.
10. Cuff AL, Sillitoe I, Lewis T, Redfern OC, Garratt R, Thornton J, Orengo CA. The CATH classification revisited-architectures reviewed and new ways to characterize structural divergence in superfamilies. Nucleic Acids Research. 2009;37:D310–D314.
11. Harrison PM, Sternberg MJE. The disulphide beta-cross: From cystine geometry and clustering to classification of small disulphide-rich protein folds. Journal of Molecular Biology. 1996;264:603–623.
12. Richardson JS. The anatomy and taxonomy of protein structure. Advances in Protein Chemistry. 1981;34:167–339.
13. Hutchinson EG, Thornton JM. PROMOTIF - A program to identify and analyze structural motifs in proteins. Protein Science. 1996;5:212–220.
14. Valdar WSJ, Thornton JM. Protein-protein interfaces: Analysis of amino acid conservation in homodimers. Proteins-Structure Function and Genetics. 2001;42:108–124.
15. Frisch MJ, Trucks GW, Schlegel HB, Scuseria GE, Robb MA, Cheeseman JR, Montgomery JA, Vreven T, Kudin KN, Burant JC, Millam JM, Iyengar SS, Tomasi J, Barone V, Mennucci B, Cossi M, Scalmani G, Rega N, Petersson GA, Nakatsuji H, Hada M, Ehara M, Toyota K, Fukuda R, Hasegawa J, Ishida M, Nakajima T, Honda Y, Kitao O, Nakai H, Klene M, Li X, Knox JE, Hratchian HP, Cross JB, Adamo C, Jaramillo J, Gomperts R, Stratmann RE, Yazyev O, Austin AJ, Cammi R, Pomelli C, Ochterski JW, Ayala PY, Morokuma K, Voth GA, Salvador P, Dannenberg JJ, Zakrzewski VG, Dapprich S, Daniels AD, Strain MC, Farkas O, Malick DK, Rabuck AD, Raghavachari K, Foresman JB, Ortiz JV, Cui Q, Baboul AG, Clifford S, Cioslowski J, Stefanov BB, Liu G, Liashenko A, Piskorz P, Komaromi I, Martin RL, Fox DJ, Keith T, Al-Laham MA, Peng CY, Nanayakkara A, Challacombe M, Gill PMW, Johnson B, Chen W, Wong MW, Gonzalez C, Pople JA. Gaussian 03, Revision B.05. Gaussian, Inc.; Pittsburgh PA: 2003.
16. Morgunova E, Tuuttila A, Bergmann U, Isupov M, Lindqvist Y, Schneider G, Tryggvason K. Structure of human pro-matrix metalloproteinase-2: Activation mechanism revealed. Science. 1999;284:1667–1670.
17. Ozhogina OA, Grishaev A, Bominaar EL, Patthy L, Trexler M, Llinas M. NMR Solution Structure of the Neurotrypsin Kringle Domain. Biochemistry. 2008;47:12290–12298.
18. Kabsch W, Sander C. Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers. 1983;22:2577–2637.
19. Petersen MTN, Jonson PH, Petersen SB. Amino acid neighbours and detailed conformational analysis of cysteines in proteins. Protein Engineering. 1999;12:535–548.
20. Gorbitz CH. Conformational properties of disulfide bridges. 2. Rotational potentials of diethyl disulfide. Journal of Physical Organic Chemistry. 1994;7:259–267.
21. Schmidt B, Ho L, Hogg PJ. Allosteric disulfide bonds. Biochemistry. 2006;45:7429–7433.
22. Weiner SJ, Kollman PA, Case DA, Singh UC, Ghio C, Alagona G, Profeta S, Weiner P. A New Force-Field for Molecular Mechanical Simulation of Nucleic-Acids and Proteins. Journal of the American Chemical Society. 1984;106:765–784.
23. Stathakis P, Lay AJ, Fitzgerald M, Schlieker C, Matthias LJ, Hogg PJ. Angiostatin formation involves disulfide bond reduction and proteolysis in kringle 5 of plasmin. Journal of Biological Chemistry. 1999;274:8910–8916.
24. Stephens PJ, Jollie DR, Warshel A. Protein control of redox potentials of iron-sulfur proteins. Chemical Reviews. 1996;96:2491–2513.