Fold group 1: C2H2-like finger
Domains from this group are composed of a β-hairpin followed by an α-helix that forms a left-handed ββα-unit (Table ). Two zinc ligands are contributed by a zinc knuckle (a unique turn with the consensus sequence CPXCG) (
3,
36) at the end of the β-hairpin and the other two ligands come from the C-terminal end of the α-helix. The fold group consists of two families: C2H2 fingers and IAP domains.
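As a rough illustration of the knuckle consensus quoted above, a sequence can be scanned for CPXCG with a regular expression. This is a minimal sketch and not the procedure used in this work: the function name and the toy sequence are hypothetical, and real zinc knuckles often deviate from the strict consensus, so a literal pattern match would miss many of them.

```python
import re

# Consensus quoted in the text: C-P-X-C-G, where X is any residue.
KNUCKLE = re.compile(r"CP.CG")

def find_knuckles(sequence):
    """Return (0-based start, matched text) for every CPXCG occurrence."""
    return [(m.start(), m.group(0)) for m in KNUCKLE.finditer(sequence.upper())]

# Hypothetical toy sequence, not taken from any real protein.
toy = "MKTCPECGKAFSRSDHLCPACGQ"
print(find_knuckles(toy))
```

In the toy sequence the two cysteines of each match would correspond to the two knuckle-derived zinc ligands; the remaining two ligands lie elsewhere in the domain and are not captured by this pattern.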
C2H2 finger family. The C2H2 zinc finger motif (classic zinc finger) was first discovered in the
Xenopus laevis transcription factor IIIA, and has since been found to be present in many transcription factors and in other DNA-binding proteins (1ncs, 1zfd, 1tf6, 1ubd, 2gli, 1bhi, 1sp2, 1rmd, 2adr, 1znf, 1aay, 1sp1, 1bbo, 2drp, 1yui, 1ej6, 1klr) (
16–
18), which recognize specific sequences of DNA (Table , and Fig. A and B). The classical C2H2 zinc finger typically contains a repeated 28–30 amino acid sequence, including two conserved cysteines and two conserved histidine residues. However, other combinations of Cys/His as the zinc-chelating residues are possible. Nucleic acid-binding C2H2 fingers bind to the major groove of DNA through the N-terminus of the α-helix. Recognition of specific DNA sequences is achieved by the interaction of the DNA base with side chains from the surface of the α-helix (
37). Although regulation of transcription seems to be the most important task performed by the C2H2 zinc fingers, recently determined structures of this class suggest their roles in mediating protein–protein interactions (1k2f, 1fv5, 1fu9) (
38,
39).
IAP domain family. The inhibitor of apoptosis (IAP) contains a CCHC pattern (similar to that of the U-shaped transcription factor from the C2H2 finger family) that coordinates a zinc ion (1e31, 1jd5, 1c9q, 1g73) (Fig. B). The IAPs have been reported to regulate programmed cell death by inhibition of caspases (
40). Structures of the baculovirus inhibitor of apoptosis repeat (BIR) domains and the anti-apoptotic protein ‘survivin’ contain a conserved core made up of a central three-stranded β-sheet and four short α-helices (
41). The zinc-binding region of IAP structurally resembles the classic C2H2 motif (Fig. B) in that the first two ligands are from a knuckle, and the other two ligands come from the C-terminal region of a broken α-helix (Fig. B). A PSI-BLAST alignment of the zinc-binding region of the BIR domains is shown along with the sequence of the U-shaped transcription factor (1fu9) (
42) to highlight the similarities between the BIR domains and the classical C2H2 zinc fingers and to show the variability of the linker length between the last two zinc ligands in different BIR proteins (Fig. C). Despite the structural resemblance of the zinc-binding site in classic C2H2 domains and IAP domains, we do not have convincing evidence for homology between them and thus conservatively place them into two distinct families.
Fold group 2: Gag knuckle
The structure of this fold group is composed of two short β-strands connected by a turn (zinc knuckle) followed by a short helix or a loop (Fig. A and B). Two N-terminal zinc ligands are donated by the zinc knuckle and two others come from the loop or are placed at both ends of a short helix. The Gag-knuckle resembles the classical C2H2 motif with a large part of the helix and the β-hairpin truncated. Gag-knuckles are thus very short (about 20 amino acids) as compared with C2H2-like domains (about 30 residues).
This group contains C2HC zinc fingers from the retroviral Gag proteins (nucleocapsid) that are referred to in the literature as zinc knuckles (
16,
43,
44). However, this term has also been used previously to describe a unique turn with the consensus sequence CPXCG, in which the cysteines contribute to zinc binding (
3,
36). Thus for the sake of clarity we refer to the zinc finger from the retroviral gag proteins as the ‘Gag knuckle’. We consider three families.
Retroviral Gag knuckle family. In the retroviral Gag knuckle, a one-turn α-helix follows the β-hairpin (Fig. B). The structure of this motif has been reported for the retroviral nucleocapsid (NC) protein from HIV and other related viruses (1a1t, 1a6b, 1dsq, 1dsv). The Gag knuckle binds to single-stranded RNA and is involved in recognizing specific sequences of RNA needed for viral packaging (
16). Unlike the C2H2 motif, where the zinc finger is repeated many times, the Gag knuckles are mostly found as two conserved domains separated by a small linker region (
44).
Polymerase Gag knuckle family. We have included the structure of a zinc finger from the A subunit of RNA polymerase II (1i3q; Fig. B) (
31) in this fold group. The structure of the Gag knuckle from RNA polymerase II aligns with a root mean square deviation (RMSD) of 3.5–4.4 Å with different members of the retroviral NC proteins (104 atoms). The function of this zinc finger from RNA polymerase, however, remains unknown, although we can hypothesize that it may be involved in RNA-binding. In retroviral Gag-knuckles, a one-turn helix follows the hairpin. In the polymerase Gag-knuckle, this helix is substituted by a loop (Fig. B).
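The RMSD values quoted here and below summarize how far paired atoms lie from each other after two structures have been superimposed. The sketch below shows only that final arithmetic step, assuming the two coordinate lists are already superimposed and listed in the same atom order; the superposition method and atom selection used in this work are not reproduced, and the coordinates are made up for illustration.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD over equally ordered atom pairs, in the units of the input.

    Assumes the two coordinate sets have already been superimposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Hypothetical coordinates: set b is set a shifted by 2 units along z.
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
b = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0), (0.0, 1.0, 2.0)]
print(rmsd(a, b))  # every atom is displaced by 2 units, so the RMSD is 2.0
```

Because the value depends on how many atoms are paired, an RMSD is only interpretable together with the atom count, which is why both are reported in the text.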
Reovirus outer capsid protein σ3 Gag knuckle family. The reoviral outer capsid protein σ3 (
45) includes a zinc-binding motif that is best described as a Gag knuckle (1fn9). The structure of the zinc-binding region resembles that from RNA polymerase and consists of a knuckle followed by a loop. Mutation of the zinc-binding residues has been shown not to affect binding of σ3 to double-stranded RNA, but to eliminate its ability to associate with the capsid protein µ1 (
46). PSI-BLAST searches with sequences of the NC proteins of HIV, the zinc-binding region of RNA polymerase II and that from reovirus outer capsid protein σ3 fail to find links to each other, and thus we conservatively place them in different families.
Fold group 3: treble clef finger
The treble clef motif consists of a β-hairpin at the N-terminus and an α-helix at the C-terminus that contribute two ligands each for zinc binding. The first two ligands come from the zinc knuckle and the other two ligands are donated by the N-terminal turn of the helix (Fig. ). In most treble clef fingers, a loop and a β-hairpin are present between the N-terminal β-hairpin and the C-terminal α-helix. This loop and the β-hairpin (sometimes substituted by a helix or a pair of helices) vary in length and conformation (Figs and ).
Treble clef fingers are present in a diverse group of proteins that frequently do not share sequence and functional similarity with each other. Previous analysis (
3) has revealed that proteins from seven different SCOP folds (version 1.53) (
7) contain the treble clef finger as a structural core. In the present work, we detect additional members in the treble clef fold group. In some members of this fold group, the treble clef finger is the only domain present. However, in most cases, treble clef motifs are found to be incorporated in multi-domain proteins or are augmented by additional secondary structural elements. In some proteins, tandem or overlapping treble clefs are present possibly due to duplication events [LIM domain (1iml), FYVE domain (1vfy)] (
3). We provisionally divide this fold group into 10 families.
RING finger-like. A number of proteins contain a conserved 40–60 residue cysteine-rich domain, which binds two zinc ions and is termed C3HC4 zinc-finger or ‘RING’ finger. Along with the classic two-zinc RING fingers (1chc, 1bor, 1jm7, 1rmd, 1fbv, 1g25, 1ldj, 1e4u) we include the Pyk2-associated protein β ARF-GAP domain (1dcq) in this family. The structure of the ARF-GAP finger is very similar to that of the RING fingers (RMSD of 0.94–2.13 Å; 104 atoms) and there is a residual sequence similarity. However, the ARF-GAP finger lacks the second zinc-binding site that is present in the RING fingers.
Protein kinase cysteine-rich domain. This family includes the C-terminal domain of the human TFIIH p44 subunit (1e53), and cysteine-rich domains from kinases (1ptq, 1faq, 1kbe). The zinc-binding sites are at locations similar to those of RING fingers. The topological difference between the protein kinase cysteine-rich domain and RING finger structures has been explained on the basis of a circular permutation (
3) and this family may be evolutionarily related to the RING finger-like proteins.
Phosphatidylinositol-3-phosphate binding domain. This family includes the zinc-binding regions of the FYVE domain (1vfy, 1dvp, 1joc) that bind phosphatidylinositol-3-phosphate with a high specificity, the effector domain of rabphilin-3a (1zbd) and the PHD zinc fingers (1f62, 1fp0). These domains bind two zinc ions and consist of an overlapping doublet of treble clef finger domains (
3).
Nuclear receptor-like finger. This family mostly consists of domains that do not contain any additional secondary structural elements N- or C-terminal to the treble clef motif; however, several proteins of this family contain duplicated treble clef domains. Nuclear receptor-like fingers are mostly nucleic acid-binding proteins involved in transcription and translation and are likely to be evolutionarily related. The family contains the structures of the S14 (chain N of 1fjf) and L24E (chain T of 1jj2) ribosomal proteins, a domain in the MutM protein (1ee8, 1l2b), a domain in endonuclease VIII (1k3w), the C-terminal domain of Ile-tRNA synthetase (1ffy), the DNA repair factor XPA zinc-binding domain (1xpa), GATA-1 (4gat, 2gat, 1gnf), the nuclear receptor DNA-binding domain (1hcq, 1kb6), the LIM domain (1b8t, 1iml, 1zfo, 1g47), the I-TevI endonuclease zinc finger (1i3j) and the ribosomal protein L31 (1lnr).
A majority of these proteins have been analyzed elsewhere (
3) and here we discuss three examples of the most deviant proteins that we feel belong to this family.
Nuclear receptor DNA-binding domain. The structure of the estrogen receptor DNA-binding domain (1hcq) (
48) and its homologs have two zinc-binding sites (Fig. ). Like the Gag knuckles, each steroid hormone receptor contains two repeats (domains) of a finger that binds zinc ions via four conserved cysteines. The first, N-terminal site (domain) is a typical treble clef motif where two of the ligands come from an α-helix and two more are contributed by a knuckle. The second, C-terminal site (domain) shows only partial resemblance to the treble clef finger (Fig. ). As in classical treble clef fingers, two of the zinc ligands are located in the N-terminal turn of an α-helix. However, the zinc knuckle, which donates the other two ligands in treble clefs, is absent in this domain, and this zinc half-site appears structurally very different from the first zinc half-site of treble clef domains. Despite this structural difference, some resemblance remains: the two cysteines in this half-site are flanked by extended regions that are conformationally similar to the short β-strands in classical treble clefs. It is likely that the second finger is a result of duplication of an ancestral treble clef domain [see SCOP scopid = d1hcqa_ (
7)], after which substantial structural changes have occurred. These changes involved deterioration of the β-hairpin and the zinc-knuckle in it, which resulted in a structural reorganization of the first half-site. Our hypothesis about homology between the two zinc-binding domains is based on three lines of evidence: (i) duplications are among the most common events in molecular evolution and it may be easier for a protein to duplicate a domain than to construct one
de novo; (ii) the second domain is structurally similar to the first, classical treble clef domain: the similarity is manifested in the second zinc sub-site and the helix, and is complemented by local conformational similarity in the region of the first sub-site; (iii) the sequences of the second domain are variable in the region of the first sub-site, in particular in the length of the linker between the first two cysteines (Fig. ). Such variability indicates that the first sub-site may be prone to sequence, and thus structural, changes that do not affect the function of the protein, which may explain the differences from the typical treble clef arrangement of secondary structural elements.
I-TevI endonuclease zinc finger. I-TevI endonuclease belongs to the GIY-YIG family of intron-encoded endonucleases. The DNA-binding domain of intron endonuclease I-TevI (1i3j) (
48) is a zinc finger that we classify as a treble clef. This domain is the shortest among treble clef fingers (Fig. ). Its α-helix is deteriorated to a single turn. The zinc finger in I-TevI interacts with the DNA through two hydrogen bonds with the phosphate backbone at the minor groove, and is not seen to make any base-specific contacts. The general orientation of I-TevI finger on DNA is similar to the one typical for treble clef domains in which the α-helix forms most contacts with DNA. Due to the short length of the α-helix (one turn), this zinc finger also resembles the zinc ribbons and aligns with an RMSD of 1.41–5.35 Å (96 atoms) with other zinc ribbons, but lacks the third strand found in most zinc ribbons.
Ribosomal protein L31—a deteriorated treble clef finger. The structure of the large ribosomal subunit from
Deinococcus radiodurans (
30) reveals the L31 protein (Chain Y of 1lnr) as a treble clef finger in which the zinc-binding ligands are replaced with other residues. Despite the absence of the zinc-binding site, the structure of L31 contains all the properly oriented elements of the classical treble clef finger, such as the β-hairpin with the zinc knuckle, inserted β-hairpin and the C-terminal α-helix and thus undoubtedly belongs to this fold group (Fig. ) along with two other ribosomal proteins L24E and S14. Another unusual feature of L31 treble clef is the presence of a long 25-residue insertion that follows the knuckle β-hairpin and is itself structured as a β-hairpin. Insertions after the knuckle β-hairpin are not uncommon among treble clefs and have been noted before to be of variable length (from 0 to 6 residues) and conformation (
3); however, L31 possesses the longest insertion among them.
In this and later cases we hypothesize that complete or partial absence of the zinc ligands in zinc finger-like structures is due to a loss of ligands rather than a gain of ligands and of the zinc-binding property. The arguments for this hypothesis are two-fold. First, the majority of the proteins from various phylogenetic lineages have well-formed zinc sites. Typically, zinc ligands are absent only in a few isolated phylogenetic groups or families, and perhaps not even in all representatives of these groups. Second, the structures around zinc-binding sites are frequently unusual and have backbone and side-chain geometries that are probably not among the most favorable conformations in globular proteins. This geometry has probably arisen in conjunction with the zinc site formation.
YlxR-like hypothetical cytosolic protein. The hypothetical cytosolic protein SP0554 encoded by a gene from the NusA/InfB operon of
Streptococcus pneumoniae [1g2r (
49)] is another example of a treble clef finger with a deteriorated zinc-binding site. In contrast to the L31 protein, for which no close homologs can be found with the zinc-binding site still intact, PSI-BLAST searches reveal many such homologs for the SP0554 protein (Fig. A). The YlxR family, to which SP0554 belongs, shows examples of partial zinc-binding site deterioration, with some members retaining only one or two cysteines/histidines at the sites occupied by zinc ligands (Fig. A). The structure of this family is characterized by an additional β-strand inserted in the secondary β-hairpin of the treble clef (between the first knuckle and the helix) and an additional pair of α-helices at the C-terminus (Fig. B, shown in gray).
tRNA synthetase treble clef domain. The structure of prolyl-tRNA synthetase from
Thermus thermophilus (1hc7) (
50) unexpectedly revealed the presence of a circularly permuted treble clef finger (Fig. ). This is the first instance of a permuted treble clef finger seen among available protein structures. The N- and C-termini of the prolyl-tRNA synthetase treble clef are placed in the turn of the secondary β-hairpin and the α-helix is connected to the primary β-hairpin through an extended linker, part of which forms a β-strand hydrogen-bonded to the secondary β-hairpin (Fig. ) in a manner similar to that observed in RING fingers.
NAD+-dependent DNA ligase treble clef domain. The structure of the NAD+-dependent DNA ligase from
Thermus filiformis (1dgs) contains a treble clef finger inserted between the oligonucleotide binding (OB)-fold domain and helix–hairpin–helix (HhH)-containing domains (
51). This zinc finger is among the shortest known treble clef domains. It shows partial similarity to zinc ribbons because its C-terminal helical segment is very short; this segment is fused with the first α-helix of an HhH hairpin. We therefore place the domain in this fold group only provisionally.
YacG-like hypothetical protein. The recently determined structure of the
E.coli protein YacG (
52) (1lv3) is a treble clef finger since two of its zinc ligands are from a knuckle and the other two are from the first turn of a short helix. No function has been attributed to this protein. PSI-BLAST search with the YacG sequence does not find links to any other known member of the treble clef fold group and hence we place the protein in a separate family.
His-Me endonucleases. This family contains the structures of the DNase domains of colicins E7 and E9 (7cei, 1bxi),
Serratia marcescens endonuclease (1ql0), intron-encoded endonuclease I-PpoI (1a73), T4 recombination endonuclease VII (1en7) and the MH1 domain of Smad (1mhd), and has been discussed in detail previously (
3,
53). The zinc-binding sites in all but the T4 recombination endonuclease (1en7) are deteriorated. The majority of these treble clefs are catalytic and contain an active site histidine (Fig. ), except the MH1 domain of Smad, which has probably lost its catalytic activity (
53).
RPB10 protein from RNA polymerase II. The RPB10 domain folds into a three-helical bundle typical of helix–turn–helix (HTH)-motif containing transcription factors (1ef4, chain J of 1i3q). However, it contains a zinc-binding site with geometry similar to the one found in treble clef fingers: two ligands come from a knuckle and two others are contributed by the N-terminus of an α-helix. In contrast to classical treble clef domains, the secondary β-hairpin in RPB10 is replaced by two α-helices. Since the secondary β-hairpin is the least conserved part of the treble clef finger and can tolerate long insertions or contain a short helix (Figs and ), we classify the RPB10 domain as a treble clef despite the replacement of the β-hairpin by an α-hairpin.
Fold group 4: zinc ribbon
In the zinc ribbon fold group, the ligands for zinc binding are contributed by two zinc-knuckles. The core of the structure is composed of two β-hairpins forming two structurally similar zinc-binding sub-sites (Figs and ). We call one of these hairpins a primary β-hairpin (shown in purple on Figs and ). This β-hairpin contains the N-terminal zinc sub-site in classic zinc ribbon proteins, such as the transcription initiation factor TFIIB (1pft) (
54) and transcriptional elongation factor SII (TfIIS; 1tfi) (
55). The other β-hairpin (secondary; shown in yellow on Figs and ) contains the C-terminal zinc sub-site in classic zinc ribbons. Typically, an additional β-strand forms hydrogen bonds with the secondary β-hairpin, thus most zinc-ribbon domains contain a three-stranded antiparallel β-sheet in their structure. The length of the β-strands in the primary β-hairpin is usually about two to four residues. The β-strands in the three-stranded sheet vary in length, but are frequently longer (4–10 residues). The distance between the two sub-sites can vary considerably and there could be additional domains inserted in between.
The zinc ribbons are arguably the largest fold group of zinc fingers. The zinc ribbons are found in a diverse group of proteins and frequently display limited sequence similarity, which is mainly restricted to the zinc ligands and the zinc-knuckle motifs (Fig. ). This limited sequence conservation is reflected in the structural variability of zinc ribbons. The structural analysis of zinc ribbons reveals that better superpositions are achieved for some of the structures if circular permutation is assumed. Namely, the N-terminal knuckle of one structure is superimposed with the C-terminal knuckle of another structure and the C-terminal knuckle of the first structure is superimposed with the N-terminal knuckle of the second structure (Figs and ). The superposition that assumes circular permutation allows for the inclusion of an additional β-strand (shown in gray on Figs and ). Circularly permuted versions of the zinc ribbon are found in rubredoxins. Also, circular permutation is seen among zinc ribbons that are located as insertions in larger proteins. If the zinc ribbon is present at either terminus in large proteins, then it is generally not permuted.
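The notion of circular permutation used above can be made concrete by treating each domain as an ordered list of structural elements: two domains are circular permutants if one list is a rotation of the other. The following is a minimal sketch under that simplification. The element labels are hypothetical placeholders, and the check ignores the three-dimensional superposition on which the actual comparison is based.

```python
def is_circular_permutation(order_a, order_b):
    """True if order_b is a rotation of order_a (same elements, same cyclic order)."""
    if len(order_a) != len(order_b):
        return False
    if not order_a:
        return True
    doubled = list(order_a) + list(order_a)
    n = len(order_b)
    # Every rotation of order_a appears as a window of length n in the doubled list.
    return any(doubled[i:i + n] == list(order_b) for i in range(len(order_a)))

# Hypothetical element labels listed from N- to C-terminus.
classical = ["primary_knuckle", "secondary_knuckle", "third_strand"]
permuted = ["secondary_knuckle", "third_strand", "primary_knuckle"]
shuffled = ["secondary_knuckle", "primary_knuckle", "third_strand"]

print(is_circular_permutation(classical, permuted))
print(is_circular_permutation(classical, shuffled))
```

A rotation preserves the neighbours of every element around the cycle, which is why a permuted domain can still be superimposed on an unpermuted one, whereas an arbitrary reordering cannot.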
Structures that possess two knuckles in their zinc-binding sites and thus belong to this fold group fall into two distinct sub-groups defined by the geometry of zinc ligands. There exist two possible mutual orientations of the four zinc ligands placed on a tetrahedron: left-handed and right-handed. The majority of the zinc ribbon structures contain a site with left-handed geometry, namely, if the zinc ligands are numbered consecutively in the sequence and we orient the molecule with the primary hairpin above the secondary hairpin, the counter-clockwise sequence of zinc ligands is 1,4,2,3. However, a few structures, such as 1b55 (
56), 1dfe (
57) (Fig. ) and both domains in 1exk (
58) belong to a right-handed sub-group, for which the counter-clockwise sequence of zinc ligands is 1,3,2,4. Conversion from one arrangement to the other can be rationalized through switching the places of ligands 3 and 4.
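The two arrangements can be told apart numerically by the sign of the scalar triple product formed from the four ligand positions taken in sequence order. The sketch below uses idealized, hypothetical tetrahedral coordinates rather than any PDB entry. Which sign corresponds to the 'left-handed' label used here depends on the viewing convention, so the mapping from sign to label is left as an assumption to be calibrated on a known structure.

```python
def sub(u, v):
    """Vector difference u - v."""
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])

def triple_product(a, b, c):
    """Scalar triple product a . (b x c)."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

def ligand_chirality(l1, l2, l3, l4):
    """Return +1 or -1 for the handedness of ligands 1-4, or 0 if coplanar."""
    v = triple_product(sub(l2, l1), sub(l3, l1), sub(l4, l1))
    return (v > 0) - (v < 0)

# Idealized tetrahedral ligand positions (hypothetical, not from a structure).
l1, l2, l3, l4 = (1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)

print(ligand_chirality(l1, l2, l3, l4))
print(ligand_chirality(l1, l2, l4, l3))  # ligands 3 and 4 exchanged
```

Exchanging any two ligands reverses the sign, which mirrors the observation that the right-handed arrangement can be derived from the left-handed one by switching ligands 3 and 4.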
Due to the significant sequence and structural variability of zinc ribbons, our classification into families is provisional and more work is required to clarify evolutionary relationships within this fold. Evolutionary classification of this fold group is complicated by the fact that there exist zinc-binding sites similar in structure to zinc ribbons but not homologous to them. For instance, some serine proteases like the NS3 protein of hepatitis C virus (1a1r) and the guanine nucleotide exchange factor Mss4 (1fwq) have a zinc-binding site formed by two loops protruding from the structural core. However, in both structures, the zinc-binding site forms neither the core of the molecule nor the center of a separate domain and thus is not included in our analysis.
Classical zinc ribbon. This family mostly includes domains from proteins involved in the translation/transcription machinery, such as transcription factors, primases, RNA polymerases, topoisomerases and ribosomal proteins. Classical zinc ribbons are characterized by a long secondary hairpin (ribbon) and thus a longer three-stranded β-sheet as compared with members of other protein families of this fold group (Fig. , 1tfi, 1qyp). However, in some proteins that are probably homologous to classical zinc ribbons (e.g. zinc ribbons from ribosomal proteins), the secondary hairpin is shorter. The transcription elongation factor SII (TfIIS; 1tfi), the transcription initiation factor TFIIB (1pft, 1dl6) and the N-terminal 12 kDa fragment of DNA primase (1d0q) are typical representatives of this family. It has been argued that the C-terminal domains of prokaryotic DNA topoisomerase (1yua) are homologous to the zinc-ribbon domains in transcription factors (
59). The two topoisomerase domains with available structure show statistically significant sequence similarity to the other members of the family, but do not retain the zinc ligands and probably do not bind zinc.
Several RNA polymerase subunits contain zinc ribbon domains that vary considerably in their length and sequence, but show a typical zinc ribbon structure. This group of zinc ribbons includes the polymerase proteins Rpb1, Rpb2, Rpb9 and Rpb12 (chains A, B, I and L of 1i50, respectively, and additionally 1qyp for an Rpb9 fragment). Rpb9 (chain I of 1i50) contains two zinc-binding domains separated by 40 residues. The structure of the C-terminal domain determined previously (1qyp) (
36) was shown to form a zinc ribbon motif similar to that of the transcription factor IIS. Rpb12 (chain L of 1i50) forms a circularly permuted ribbon (Fig. ). An unusual feature of the Rpb1 C-terminal zinc sub-site is that the two cysteine ligands are separated by an 18 residue long loop (Fig. ).
The structure of the large γ subunit of the initiation factor e/aIF2 from Pyrococcus abyssi (1kjz) has a circularly permuted zinc ribbon. Due to its close resemblance to the transcription factors, we group it with the classical zinc ribbons.
The structure of the replication protein-A 70 kDa subunit (Rpa70) from the human single-stranded DNA (ssDNA) binding RPA trimerization core (
60) contains a classic zinc ribbon inserted into the OB fold of the DNA-binding domain C. The zinc finger has been shown to modulate DNA binding (
61) although the exact functional role of this zinc finger remains unclear.
Another major subfamily of the classical zinc ribbons are ribosomal proteins. Representative structures from this subfamily are the 50S ribosomal proteins L44E (chain 2 of 1jj2), L37E (chain Z of 1jj2), L37Ae (chain Y of 1jj2) from
Haloarcula marismortui (
29), and L32 (chain Z of 1lnr), L33 (chain 1 of 1lnr) from
D.radiodurans (
30). In L44E, a large insertion (~40 residues as compared with L37Ae) exists between the two zinc-binding sub-sites. Zinc fingers are susceptible to replacements of zinc ligands and a consequent loss of zinc binding properties. This is also seen in the ribosomal protein L33 (chain 1 of 1lnr). Although the protein from
D.radiodurans does not have zinc ligands, a PSI-BLAST search finds its close homologs with cysteines intact (data not shown).
Based on pronounced structural similarity and residual sequence similarity we place zinc ribbon domains of enzymes such as aspartate transcarbamoylase and casein kinase in this family (Fig. ). The zinc ribbon of casein kinase II (1qf8) (
62) aligns structurally with an RMSD of 1.13 Å (96 atoms) with TfIIS (1tfi).
The cluster binding domain of Rieske iron sulfur protein. Iron sulfur proteins (ISPs) play a key role in electron transfer. The Rieske ISP is a high potential 2Fe–2S protein (
63). The cluster binding domain of the Rieske ISP has a rubredoxin-like fold (1ezv, 1rfs, 1g8k, 1eg9, 1fqt), which coordinates a 2Fe–2S cluster through two His and two Cys residues located in two knuckles. One of the Fe ions is coordinated by two cysteines and the other by two histidines. The domain is additionally stabilized by a disulfide bridge between the two knuckles; however, this disulfide link is not conserved among all structures. Also, the ligand at the second position of the primary knuckle, as seen in zinc ribbons and rubredoxins, is not conserved among the ISPs.
The adenovirus DNA-binding protein zinc ribbons. The adenovirus DNA-binding protein (AdDBP), a ssDNA-binding protein of the adenovirus E2A transcriptional unit, contains two zinc-binding motifs that are very similar to each other in structure. These zinc-binding motifs resemble the Rpb1 protein of RNA polymerase II (chain A of 1i50) in having long insertions between the two zinc sub-sites and in the region between the two cysteine ligands of the C-terminal sub-site (Fig. ). These domains may be homologous to the classical zinc ribbons.
The B-box zinc finger. The nuclear factor Xnf7 contains a B-box (1fre) domain (
64). The B-box structure is composed of two loose knuckles that contribute ligands for zinc binding. The structure of the Xnf7 B-box is more distant from other zinc ribbons in that it lacks a three-stranded β-sheet. However, the structure of 1fre fails most of the PROCHECK (
65) tests for high-quality structures and may not be accurate enough for detailed structural comparisons.
Rubredoxin family. Rubredoxins are low molecular weight metal-binding proteins involved in electron transfer. Representative structures of this family are the zinc-substituted rubredoxin (1dx8, 1irn), rubrerythrin (1b71), desulforedoxin (1dxg) and the polypeptide VIa of cytochrome c oxidase (chain F of 2occ). In all rubredoxins except for the cytochrome c oxidase subunit, better alignment with the classical zinc ribbons is achieved under the assumption of circular permutation, which switches the places of N- and C-terminal zinc sub-sites (Fig. ). In such an alignment, the third β-strand in the β-sheet can be matched between rubredoxins and classical zinc ribbons.
Rubredoxin-like domains in enzymes. A wide variety of enzymes contain small zinc ribbons that appear similar and may be related to the rubredoxin domains. Similar to most rubredoxins, the majority of domains in this family are circularly permuted compared with classical zinc ribbons (Fig. ). Known structures of these domains include aminoacyl tRNA synthetases, adenylate kinase and silent information regulator 2 (SIR2). In most of these proteins, zinc ribbons function as interaction modules, e.g. to provide a ‘lid’ for the enzyme’s active site (
66).
The zinc ribbons from the structures of methionine (1f4l, 1a8h), isoleucine (1ile) and valine (1gax) aminoacyl-tRNA synthetases are shown in the alignment (Fig. ). All these tRNA synthetases belong to class I and are characterized by an ATP-binding domain with the Rossmann fold topology. Typically, two zinc ribbon domains are inserted in the enzyme structure and are circularly permuted compared with classical zinc ribbons. The spacing between the two zinc sub-sites can be very large in some of these proteins. For instance, in Met-tRNA synthetase, one zinc ribbon domain is inserted between the two zinc sub-sites of another zinc ribbon domain (Figs and ), with the N-terminal zinc ribbon being permuted. The N-terminal domain in the E.coli structure (1f4l) has a deteriorated zinc-binding site (Figs and ), whereas the zinc-binding site is complete in its ortholog from T.thermophilus (1a8h). Incidentally, the second zinc ribbon domain is absent in the T.thermophilus structure.
The structure of adenylate kinase from
Bacillus stearothermophilus (1zin) reveals a zinc ribbon at the active site lid region. The zinc ribbon is shown to play a structural role in stabilizing the bacterial adenylate kinases (
66). The structures of the enzymes from
E.coli (1e4v) (
67) and from maize (1zak) (
68) contain the zinc ribbons, but lack the ligands for zinc binding (Fig. ).
The SIR2 (1ici, 1ma3, 1j8f) contains a zinc ribbon domain as an insertion to the Rossmann-like fold domain (Fig. ). Cysteines in this zinc ribbon are essential for the SIR2 function (
69) and the NAD-binding pocket from the larger domain is seen to be stabilized by the presence of the zinc ribbon motif (
70).
The structure of the 8.3 kDa protein (gene MTH1184) from
M.thermoautotrophicum (1gh9) (
14) does not contain zinc; however, the four cysteines around the potential metal-binding site and the fold of the chain argue for its classification as a zinc ribbon. The MTH1184 protein does not have homologs clearly identifiable by sequence similarity searches, but is structurally more similar to proteins of this family. For instance, VAST (
6) aligns 20 residues from all four β-strands of MTH1184 to isoleucyl-tRNA synthetase with RMSD of 1.0 Å.
Btk motif. In this and the next two families, the right-handed arrangement of zinc ligands is present. The Tec family of tyrosine kinases contains a zinc-binding motif (Btk motif) C-terminal to their pleckstrin-homology domain. The zinc-binding motif bears some resemblance to the zinc ribbons in having a three-stranded β-sheet, with two of the zinc ligands contributed by a β-turn in this sheet. To align all four Btk zinc ligands with zinc ribbons, we assume circular permutation of the Btk motif, in which the first ligand of the zinc ribbon is contributed by a C-terminal fragment of the Btk finger with the other three ligands coming from the N-terminal fragment (Fig. ).
Ribosomal protein L36. The L36 protein (1dfe) (57) is another unusual zinc ribbon with right-handed placement of the zinc ligands. In zinc ribbons with a left-handed ligand arrangement, the two knuckle hairpins are almost perpendicular to each other (Fig. ). In the L36 structure, the two knuckle hairpins are parallel to each other and, as a consequence, the positions of ligands 3 and 4 appear to be ‘flipped’ with respect to other members of the zinc ribbon fold group (Fig. ).
Cysteine-rich domain of the chaperone protein DnaJ. The cysteine-rich domain of the chaperone DnaJ (1exk) (58) contains two zinc-binding sites, each of which is composed of two zinc knuckles. Based on this property, we place the two domains of DnaJ (one zinc-binding site in each) into the zinc ribbon fold group (Fig. ). One of these domains is inserted between the two zinc sub-sites of the other domain. The zinc ligands have a right-handed arrangement, and the β-strands characteristic of classical zinc ribbons are absent in DnaJ. Consequently, the DnaJ domains align with other zinc ribbons with RMSD values in the range 3.7–6.8 Å (96 atoms). The regions around the two zinc-binding sites in DnaJ are nearly identical to each other (RMSD of 0.88 Å; 96 atoms) and the two domains are homologous.
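For readers unfamiliar with the RMSD values quoted in this section, the sketch below shows how the quantity is computed over a set of equivalent atoms. It assumes that the two coordinate sets have already been superimposed by a structure alignment program and that the atom pairing is given; the function name and the toy coordinates are hypothetical and do not come from any of the structures discussed.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation (same units as the input, e.g. Å) between
    two equally long lists of (x, y, z) coordinates of equivalent atoms.

    Assumes the two sets are already optimally superimposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Toy example: every atom of set B is displaced by 1 Å along z.
set_a = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
set_b = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]
print(rmsd(set_a, set_b))  # 1.0
```

Because the value is averaged over the aligned atoms, RMSDs are best compared only when the number of atoms is stated, as it is for the 96-atom DnaJ comparisons.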
Functional properties of zinc-finger proteins
Proteins bind zinc as a cofactor for catalysis or as a structural stabilizer. In zinc fingers, the role of zinc is structural, and zinc ions typically do not participate in the function directly; other parts of the zinc-binding molecule carry the functional importance. Small protein domains assembled around zinc ions are versatile structural templates that perform various functions. Despite their small size, zinc fingers are functionally more diverse than many larger domains: they are involved in nucleic acid (DNA and RNA) binding, protein–protein interactions and the binding of small ligands (lipids) (3), and some also possess enzymatic properties [without zinc participating in catalysis; (3, 53)]. In executing these functions, zinc fingers take part in many fundamental cellular processes, such as replication and repair, translation, programmed cell death and metal regulation.
Protein–DNA interactions. Among the eight fold groups, structures of protein–DNA complexes are known for members of the C2H2-like, treble clef and Zn2/Cys6 fold groups. The most frequent mode of DNA binding is similar among all these DNA-binding zinc fingers: the main interactions are formed by the side chains of residues from an α-helix, which generally binds to DNA in the major groove. This theme of protein–DNA interaction is not restricted to zinc fingers and is seen in 28 out of 54 DNA-binding protein families (75).
The DNA-binding mode of C2H2 fingers is illustrated by the structure of a protein–DNA complex (1aay), in which the α-helix of the finger interacts with the DNA major groove (Fig. A). All C2H2 fingers bind DNA in a similar manner. The sequence specificity and high affinity of DNA binding are achieved by the cooperative binding of the α-helices of several C2H2 zinc fingers arranged in tandem.
The DNA–protein interaction in the treble clef fingers is illustrated by the structures of the estrogen receptor DNA-binding domain (1hcq) (47) and of the intron endonuclease I-TevI (1i3j) (48) (Fig. A). The estrogen receptor belongs to the family of nuclear receptors that control transcription at hormone response elements and are regulated by the binding of steroid hormones. The DNA-binding regions of these nuclear receptors consist of two zinc-binding sites, each of which is a treble clef finger. The helices of the two fingers interact with the DNA. The N-terminal α-helix binds to the major groove of DNA and the outer β-strand of the primary β-hairpin interacts with the phosphate backbone. This mode of binding to DNA is shared by most of the treble clef fingers. The DNA-binding domain of the intron endonuclease I-TevI (1i3j) is an exception, however: the helix of its treble clef finger interacts with the minor rather than the major groove of the DNA, and the inner β-strand of the primary β-hairpin interacts with the phosphate backbone.
The Zn2/Cys6 zinc finger consists of two α-helices that coordinate two zinc ions via six cysteine residues. The first α-helix binds to the major groove of the DNA and recognizes specific triplets of DNA sequence (1d66, Fig. A). The second α-helix is involved in backbone interactions. The Zn2/Cys6 fingers generally bind to DNA as symmetrical dimers, with the dimerization domain located outside the zinc-binding domain.
Protein–RNA interactions. Zinc fingers that interact with RNA were found among the structures of members of the Gag knuckle, treble clef finger and zinc ribbon fold groups. The structure of the ribosome contains treble clef fingers and zinc ribbons that form contacts with RNA. Protein–RNA interactions are illustrated by the structures of ribosomal proteins L37E (zinc ribbon, chain Z of 1jj2) and L24E (treble clef, chain T of 1jj2) and the Gag knuckle from the HIV-1 nucleocapsid protein (1a1t) (Fig. B). The α-helix of the L24E treble clef finger interacts with the major groove of RNA, and the mode of RNA binding in treble clefs is similar to that of DNA binding. The L44E, L37E and L37Ae ribosomal proteins from H.marismortui, the L32 and L33 proteins from D.radiodurans and the L36 protein from T.thermophilus contain zinc ribbons. These ribbons interact mainly with the major groove of RNA, and different parts of the zinc ribbon make contact with the RNA.
Protein–protein interactions (homo: 1dxg, 1ici; hetero: 1fbv, chain D of 1i5o). Many zinc fingers are involved in protein–protein interactions. Some of these interactions involve dimerization of zinc fingers, as illustrated by the structures of the desulforedoxin dimer (1dxg) and the zinc-binding domain of the Sir2 homology protein (1ici) (Fig. C). These proteins display different modes of dimerization in zinc ribbons.
Zinc fingers are also known to interact with larger proteins. For instance, the structure of E.coli aspartate transcarbamoylase (76) reveals that the primary β-hairpin of the zinc ribbon from the regulatory chain (D; black in Fig. A) interacts with the catalytic chain (C; blue in Fig. A). The treble clef of the RING finger domain of the signal transduction protein Cbl binds to the ubiquitin-conjugating enzyme UbcH7 (1fbv chains A and C). Residues from the α-helix and the knuckle of the treble clef form the majority of contacts in this complex.
Although no structural information about protein–protein interactions is available for the zinc-binding domains of the C2H2-like fingers, biochemical evidence points to the involvement of the C2H2 domain in mediating protein–protein interactions, as in the erythroid FOG-1 and the U-shaped protein from Drosophila (38, 42, 77).