HIT (histidine triad) proteins, named for a motif related to the sequence HφHφHφφ (φ, a hydrophobic amino acid), are a superfamily of nucleotide hydrolases and transferases, which act on the α-phosphate of ribonucleotides, and contain a ~30 kDa domain that is typically either a homodimer of ~15 kDa polypeptides with two active sites or an internally, imperfectly repeated polypeptide that retains a single HIT active site. On the basis of sequence, substrate specificity, structure, evolution and mechanism, HIT proteins can be classified into the Hint branch, which consists of adenosine 5′-monophosphoramide hydrolases, the Fhit branch, which consists of diadenosine polyphosphate hydrolases, and the GalT branch, which consists of specific nucleoside monophosphate transferases including galactose-1-phosphate uridylyltransferase, diadenosine tetraphosphate phosphorylase, and adenylylsulfate:phosphate adenylyltransferase. At least one human representative of each branch is lost in human diseases. Aprataxin, a Hint branch hydrolase, is mutated in ataxia-oculomotor apraxia syndrome. Fhit is lost early in development of many epithelially derived tumors. GalT is deficient in galactosemia. Additionally, ASW is an avian Hint family member that has evolved to have unusual gene expression properties and the complete loss of its nucleotide-binding site. The potential roles of ASW and Hint in avian sexual development are discussed in an accompanying manuscript. Here we review what is known about biological activities of HIT proteins, the structural and biochemical bases for their functions, and propose a new enzyme mechanism for Hint and Fhit that may account for the differences between HIT hydrolases and transferases.
Galactose-1-phosphate uridylyltransferase (GalT), the second enzyme in the Leloir pathway of galactose utilization (1), was crystallized and its structure determined to understand the mechanistic basis for transferring UMP between glucose-1-phosphate and galactose-1-phosphate (2-4). Histidine triad nucleotide-binding protein (Hint), purified as a purine nucleoside/nucleotide-binding protein from rabbit heart cytosol (5) and shown to have a widely conserved sequence (6), was crystallized and its structure determined to establish whether the conserved sequence formed the basis for a new mode of nucleotide-binding (7). GalT (2) and Hint are both dimers (8) but the GalT monomer is more than twice the size of the Hint dimer. Though GalT (9) and Hint (6) homologs could be cloned by similarity and the crystal structures (2, 8) co-existed in protein databases, mutual similarity was not noted until inspection of the rabbit Hint crystal structure (7) and computational analysis (10) indicated that a GalT monomer and a Hint dimer can be superimposed and that their nucleotide-binding sites retain some of the same residues for binding a nucleoside monophosphate (7) (Figure 1). The level of sequence similarity of Hint and GalT is difficult to distinguish from noise in a standard BLAST search but can be picked up by PSI-BLAST (11). Nonetheless, the structures of Hint and GalT and modes of nucleotide-binding are convincingly similar and the mechanisms are related as discussed below.
Because Hint genes are found in virtually every sequenced genome and appear to be the common link to fragile histidine triad (Fhit) branch and GalT branch sequences, it is considered that Hint is the ancestral HIT protein (7) and Hint homologs constitute Branch I of the HIT superfamily. Branch I includes at least four human, three chicken and two yeast proteins (Table 1). Unfortunately, certain sequence annotations of HIT proteins persistently refer to the HIT motif as a zinc binding site or to Hint homologs as protein kinase C interactors or inhibitors. These annotations are based on data that has been publicly withdrawn in the case of protein kinase C (12) and on a paper whose data prove that zinc does not bind (8), as we have reviewed earlier (13). Crystallographic analysis demonstrated that fourteen of the most highly conserved residues in the HIT protein superfamily mediate interactions with the purine base, the ribose and the 5′ monophosphate (7). However, while this analysis indicated that Hint homologs are conserved as nucleotide-binding proteins, the enzymatic and physiological functions of Hint homologs were not immediately clear.
Specific substrates for rabbit Hint and its yeast ortholog, Hnt1, emerged alongside physiological assays for the yeast HNT1 gene (17). While deletion of hnt1 was tolerated by haploid yeast cells on glucose media at all temperatures and deletion of hnt1 did not lead to a galactose-negative phenotype at 30 °C, hnt1 deletion produced cells that are galactose-negative at 38-39 °C. Rabbit Hint and yeast Hnt1 were both purified from E. coli and treated with a panel of nucleotides. As Fhit and GalT both cleave substrates that can be described as NMP-X (Figure 2), a large panel of dinucleoside polyphosphates (ApnA), nucleoside di- and triphosphates, nucleotide sugars, and other compounds were tested as rabbit Hint and yeast Hnt1 substrates. The only compounds hydrolyzed at greater than ~1 nmol min⁻¹ μg⁻¹ were the adenosine 5′-monophosphoramide linked compounds AMP morpholidate, AMP-N-alanine methyl ester, AMP-N-ε-(N-α-acetyl lysine methyl ester), and AMP-NH2. AMP-NH2 was hydrolyzed at greater than 1,000,000 s⁻¹ M⁻¹ by both rabbit and yeast enzymes. This activity was at least 200,000-fold dependent upon the His116 nucleophile of Hnt1, which corresponds to His112 of mammalian Hint. Just as yeast Hnt1 enzyme activity was destroyed by the H116A mutation, so the hnt1-H116A allele was without function in supporting growth of yeast cells on galactose at 39 °C. Because the Hnt1-H116A protein is not unstable, it was concluded that hydrolysis of an Hnt1 substrate is required for growth of yeast cells at 39 °C (17).
A major clue to Hint function was provided by identification of a two-hybrid interaction between human Cdk7 and human Hint1 (18). Cdk7 is the kinase catalytic subunit of CAK, the Cdc2 activating kinase complex, consisting of Cdk7, cyclin H and MAT-1 (19-21). Additionally, this heterotrimer is part of a larger complex termed general transcription factor TFIIH that contains the major RNA polymerase II C-terminal domain kinase activity (22, 23). Adding to the significance of the findings, the yeast Cdk7 homolog, Kin28, had a weak two-hybrid interaction with yeast Hnt1 and it was demonstrated that the growth of a temperature sensitive mutant in kin28 was impaired by hnt1 deletion on galactose media (18). Though the two-hybrid isolation of Hint as a Cdk7 interactor initially suggested a physical basis for regulation (18), expression of hnt1-H116A in kin28-ts backgrounds was as deleterious as the hnt1 knockout, indicating that Hnt1 enzyme activity is a positive regulator of Kin28 function on galactose media (17).
Though fungal and plant orthologs of mammalian enzymes are typically 50% identical at the amino acid sequence level, the S. cerevisiae Hint1 ortholog Hnt1 is only 22% identical, maintaining little more than the residues that form the nucleotide-binding site and the dimer interface (7). To address the possibility that Hnt1 enzymatic activity is not only necessary but sufficient for Kin28 regulation, rabbit Hint was expressed in place of yeast Hnt1 and showed full suppression of every hnt1 Δ phenotype (17).
One obvious mechanism by which loss of Hnt1 enzyme activity might result in inhibition of a protein kinase was that the as yet unidentified in vivo substrate of Hint, provisionally termed AMP-X, which might be AMP-NH2 or a chemically related molecule, could be an ATP-competitive protein kinase inhibitor. Further, because Cdk7 is both CAK and the kinase component of TFIIH, it was considered that Hnt1 enzyme activity might be a positive regulator of either Kin28 or Cak1. Consistent with the possibility that Cak1 might be the target of Hnt1 regulation, cak1-ts mutants are synthetically less viable with hnt1 mutations. However, when hnt1 mutations were put into a strain that contains CDC28 alterations that bypass the essential function of Cak1, poor growth was observed, indicating that Hnt1 does not act as a positive regulator of Cak1 (17). In fact, in the cak1 Δ hnt1 Δ background, poor growth was observed with glucose and galactose as carbon sources. This experiment excluded Cak1 as the target of Hnt1 regulation and also excluded the possibility that the galactose-dependent nature of hnt1 phenotypes might have been due to AMP-X being produced solely as a byproduct of the Leloir pathway of galactose utilization.
With genetic experiments having shown that Hnt1 functions downstream of Cak1, and that hnt1 mutations synergize with nonphosphorylatable Kin28, overexpressed Kin28, and temperature-sensitive mutations in kin28, ccl1 and tfb3 (17), it remains to be determined precisely what the nature of Hnt1 regulation is and why the defect is easier to discern on galactose media than on glucose. Four models that account for all available data are shown in Figure 3. In each model, the glucose-essential function of Kin28 is depicted as residing in the TFIIH complex. Kin28 is also shown as existing in a CAK-like trimer consisting of Kin28, Ccl1 and Tfb3, which has recently been shown to exist in vivo (24), that we refer to as TFIIK. We further illustrate the TFIIK trimer as reversibly dissociable to its constituent polypeptides, with the equilibrium favoring monomers at higher temperature and as a function of temperature-sensitive alleles of Kin28, Ccl1 and Tfb3. It is known that the combination of temperature-sensitive Tfb3 and nonphosphorylatable Kin28 leads to Kin28 proteolysis (25). On the contrary, hnt1 deletion with nonphosphorylatable Kin28 is synthetically lethal on galactose media but hnt1 deletion does not destabilize the Kin28 polypeptide (17). For these reasons, each model hypothesizes that the Kin28 monomer, released from TFIIK at elevated temperature and with mutations in Cak1, Kin28, Ccl1 or Tfb3, is susceptible to inhibition in the absence of Hnt1 enzyme activity, and that the hnt1-inhibited form of Kin28 is disadvantaged in complex formation. Additionally, each model suggests that the hnt1-inhibited form of Kin28 is stable and inhibits growth on galactose. Further, each model hypothesizes that with mutations in Cak1, Kin28, Ccl1 or Tfb3 combined with loss of Hnt1, sufficient Kin28 is drawn down to the poorly complexing form to deplete Kin28 from TFIIH and thus reduce viability on glucose.
The models consider alternate mechanisms by which Hnt1 enzyme activity may regulate Kin28 function. In model A, the small molecule Hnt1 substrate provisionally termed AMP-X is a noncovalent inhibitor of Kin28 and the function of Hnt1 is to reduce the concentration of this molecule in the nucleocytoplasm. In model B, AMP-X is an Hnt1 substrate that covalently modifies Kin28. In model C, AMP-X is an Hnt1 substrate that adenylylates Kin28 as in model B and, additionally, Kin28-AMP is an Hnt1 substrate. Finally, in model D, Kin28 becomes adenylylated by virtue of reaction with ATP and Kin28-AMP is the only Hnt1 substrate. In models A, B and C, small molecules are proposed to be important Hnt1 substrates. In models C and D, adenylylated Kin28 protein is proposed to be an important Hnt1 substrate. Experiments are in progress to test these models.
Ataxia-oculomotor apraxia syndrome (AOA) is an early onset, progressive neurological disease characterized by balance and facial motor problems, cerebellar atrophy and hypoalbuminemia, whose neurological symptoms are difficult to distinguish from ataxia-telangiectasia (26). At the halfway point of a survey intended to catalog every case of hereditary ataxia in Portugal, AOA is second in occurrence only to Friedreich’s among the autosomal recessive ataxias (27). Despite literature that is sparse outside of Portugal (27) and Japan (28), AOA has been found in many ethnic groups and does not appear to be a Portuguese or Japanese disease but rather a disease that is rare but ubiquitous (26). Two groups succeeded in positional cloning of an AOA disease locus to the APTX gene encoding Aprataxin on 9p13 (29, 30).
Human Aprataxin is expressed in two splice forms (30). The minor splice form encodes a predicted polypeptide of 168 amino acids consisting of a Hint domain and an apparent C(2)H(2) zinc finger C-terminal to the Hint domain (29, 30). The Hint domain of Aprataxin shows 31% amino acid identity with rabbit and human Hint in an 86 amino acid stretch that begins with the first beta strand and ends with the HIT motif in the fifth and last strand (7). Though this segment does not include helix 1 and conserved sequences C-terminal to strand 5, it appears to be a minimum Hint domain we predict to dimerize and possess adenosine 5′-monophosphoramidase activity. C-terminal to the Hint domain, Aprataxin has the sequence CX(2)CX(12)HX(3)H that may be a nucleic acid-binding motif consisting of a single zinc finger (31, 32).
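The spacing pattern CX(2)CX(12)HX(3)H quoted above can be turned into a simple sequence scan. The following is a minimal sketch, assuming that X(n) stands for any n residues. The function name `find_zinc_finger_candidates` and the toy sequence are hypothetical illustrations; the sequence is not taken from Aprataxin, and a match to the pattern only flags a candidate, not a demonstrated zinc finger.

```python
import re

# CX(2)CX(12)HX(3)H written as a regular expression:
# "." matches any single residue, "{n}" repeats it exactly n times.
ZINC_FINGER_PATTERN = re.compile(r"C.{2}C.{12}H.{3}H")


def find_zinc_finger_candidates(sequence):
    """Return (start_index, matched_segment) for each non-overlapping match.

    start_index is 0-based. The sequence is upper-cased before scanning.
    """
    return [
        (match.start(), match.group())
        for match in ZINC_FINGER_PATTERN.finditer(sequence.upper())
    ]


# Hypothetical toy sequence built to contain exactly one match.
toy_sequence = "MKT" + "C" + "AA" + "C" + "G" * 12 + "H" + "KLM" + "H" + "QQ"

if __name__ == "__main__":
    for start, segment in find_zinc_finger_candidates(toy_sequence):
        print(start, segment)
```

A fixed-spacing pattern like this is deliberately strict. Homologs with insertions in either loop would be missed, so a profile-based search would be the more sensitive choice for real database work.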
The major splice form of Aprataxin is extended 174 amino acids at the N-terminus with respect to the minor splice form. Residues 6 through 102 of the N-terminus of Aprataxin are 41% identical with the N-terminus of human polynucleotide kinase (PNK), an enzyme with 5′-DNA kinase, 3′-phosphatase activity that is located C-terminal to this domain (33, 34). The observation that PNK and Aprataxin share an N-terminal domain suggested that Aprataxin may have a role in single strand break repair (30). Indeed the PNK and Aprataxin N-terminal domain, termed PANT (30), was recognized as a forkhead-associated (FHA) domain by Keith Caldecott (personal communication). FHA domains are phosphothreonine-binding protein association domains that have been found in a variety of proteins involved in responses to DNA damage (35-38).
Many eukaryotes appear to have multiple Hint family members, one or more of which are Hint homologs, and one which is an Aprataxin homolog (Table 1). These sequences can be distinguished by degree of similarity to Hint and Aprataxin. Though Aprataxin homologs only align with Hint sequences inclusive of the β sheet, they are typically larger than Hint homologs because they are extended at the N and C-termini and sometimes by insertions between helix 2 and strand 4. To date, we have only observed PANT/FHA domains in Aprataxin-homologous proteins from vertebrates. Invertebrate Aprataxin homologs, such as S. cerevisiae Hnt3 (17), possess a domain structure similar to the minor splice form of human Aprataxin (30). The molecular targets and pathways involving Aprataxin and its homologs remain to be discovered. With identification of such targets, it is hoped that new light can be shed on brain development and motor coordination.
The observation that Hint homologs are nearly ubiquitous and that they are encoded in the smallest observed bacterial genomes suggested that Hint was present at the cellular root of the tree of life (7). Given the ability of Hint homologs to hydrolyze AMP-NH2 and to hydrolyze AMP linked to a lysine sidechain (17), we speculate that Hint may have evolved as an enzyme to cleanse spontaneously adenylylated proteins, which are potentially a consequence of life at high temperature with high levels of ATP. During evolution, certain proteins may have developed the ability to react in a suicide manner with adenosine nucleotides leading to specific sites of high-occupancy adenylylation. In addition, specific proteins may have become sensitive to noncovalent effects of accumulated Hint substrates. Developments such as these could be responsible for specificity in the consequences of loss of Hint homologs, such as has been observed in yeast. Specificity in function of Hint homologs clearly exists because hnt1 phenotypes are easy to observe in cells that are HNT3+ (17) and because the genetics of AOA indicate that affected individuals have loss of function APTX mutations (29, 30) but ought to contain normal HINT1-3 genes.
As will be discussed below, it appears that early to midway through eukaryotic evolution, a HIT protein underwent mutation and selection to become a Fhit-homologous Branch II enzyme. As illustrated in Figure 2, Fhit substrate specificity differs from that of Hint with Fhit homologs preferring to hydrolyze substrates such as AppppA (39), ApppA (40), γ-(m-Nitrobenzyl) adenosine 5′-O-triphosphate (41), and ApppBODIPY (42). This amounts to an altered specificity for the adenylylation step of the reaction because adenylylation of each of these substrates produces an identical Fhit-AMP intermediate that, like Hint-AMP, is enzymatically hydrolyzed. Thus, if the original enzyme was a Branch I enzyme, the new enzyme was altered to change its adenylylation specificity from attack of AMP-NXH to attack of ApppX.
We have previously observed that a HIT protein may have undergone a tandem duplication with addition of a connecting helix and loss of one active site to produce a GalT-like nucleotidyl transferase (7). As shown in Figure 2, GalT and other Branch III enzymes have substrates that conform to the structure NMP-X, and become nucleotidylated with release of the leaving group. Unlike Hint and Fhit intermediates, nucleotidylated GalT intermediates are stable in 55 M H2O (3, 43). Not being hydrolases, Branch III enzymes enforce dependence of the intermediate reacting with a second substrate, typically phosphate or a phosphorylated hexose, to produce product. A possible basis for the differences between HIT hydrolases and transferases is provided in the mechanistic section below.
Apart from Aprataxin (29, 30), which is a divergent member of the Hint branch, and the branch II and branch III enzymes Fhit (44) and GalT (45), the human genome appears to encode at least two additional Hint homologs (46). As shown in Table 1, HINT2, HINT3 and the putative pseudogene HINTP1 appear to be encoded at 9p11.2, 6q22.33 and 7q35, respectively. Expressed sequence tags (ESTs) and/or cDNAs corresponding to HINT2 and HINT3 have been deposited, indicating more restricted expression patterns than the prototypical HINT (now HINT1) gene at 5q31.2 (47). HINT2, expressed in liver, is predicted to encode a Hint homolog with a 35 amino acid N-terminal extension with respect to the canonical 126 amino acid rabbit and human Hint. HINT3 is predicted to encode a primary translation product of 182 amino acids, extended 31 and 25 amino acids at the N and C-termini, respectively. An ortholog of Hint3 including the C-terminal extension can be found in several other metazoans and these genes are clearly expressed at the mRNA level. The HINTP1 locus is apparently intronless and its expression is supported by no publicly available EST. Conceptual translation of HINTP1 produces a primary translation product of 126 amino acids with the peculiar substitution of two conserved histidine residues, H51R and H110Q. In the absence of evidence for expression of HINTP1, this locus has been annotated as a pseudogene. The physiological basis for the complexity of Hint homologs remains to be determined.
Perhaps the most remarkable sequence related to Hint occurs in birds on the W chromosome. ASW, which has every chemically important residue and nearly all nucleoside-binding residues mutated with respect to Hint, is an apparently catalytically inactive Hint family member encoded in an array of approximately 40 tandem repeats on avian W chromosomes (48, 49) whose mRNA may be stabilized by a unique mechanism (50). A canonical Hint homolog is encoded on avian Z chromosomes (49). Female birds are the heterogametic (WZ) sex while males are the homogametic (ZZ) sex. As the genetic basis for sex determination is not known in birds (51) and ASW is expressed in a temporal and spatial manner that could relate to feminization (48, 49), it will be important to investigate the function of ASW and avian Hint genetically, biochemically and structurally. Heterodimerization as a mode of ASW inhibition of Hint was suggested in the absence of knowledge that Hint function depends on enzyme activity (48-50). However, because Hint is a dimeric enzyme (7) that shows no cooperativity with respect to substrate (17), it is not obvious how heterodimerization of one inactive subunit with one active subunit would reduce specific activity by more than 50%. In an accompanying paper we suggest three mechanisms by which ASW may plausibly function as a dominant negative form of Hint to promote feminization in avian development (52). The most interesting single feature in the ASW sequence that could make this molecule not only negative but dominant is Gln127, which is substituted for Trp123 in Hint, a residue that is buried across the Hint dimer interface in the interior of the other monomer (7). Our molecular model of the proposed Hint-ASW heterodimer predicts that ASW Gln127 may pull Hint His114 out of its active conformation, accounting for dominant negativity (52).
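The 50% ceiling in the heterodimerization argument follows from simple arithmetic. The sketch below assumes two independent active sites per dimer, as implied by the absence of cooperativity. The function name and the 10% residual activity used for the dominant-negative case are hypothetical values chosen only for illustration, not measurements.

```python
def dimer_specific_activity(subunit_activities):
    """Activity of a dimer relative to a Hint homodimer (defined as 1.0).

    Each entry is the activity of one subunit relative to a Hint subunit
    in a homodimer. Sites are assumed to act independently.
    """
    if len(subunit_activities) != 2:
        raise ValueError("a dimer has exactly two subunits")
    return sum(subunit_activities) / 2.0


# Hint homodimer: both sites fully active.
homodimer = dimer_specific_activity([1.0, 1.0])

# Passive heterodimer: ASW contributes nothing but leaves Hint untouched.
passive_heterodimer = dimer_specific_activity([1.0, 0.0])

# Dominant-negative heterodimer: ASW also impairs its Hint partner.
# The residual 0.1 is an assumed figure for illustration only.
dominant_heterodimer = dimer_specific_activity([0.1, 0.0])

if __name__ == "__main__":
    print(homodimer, passive_heterodimer, dominant_heterodimer)
```

Under these assumptions a passive inactive partner can never push a heterodimer below half the homodimer value, so any larger loss requires the inactive subunit to act on the active one.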
Branch II of the HIT superfamily consists of orthologs of human Fhit. The first member of this branch to be characterized biochemically (53), cloned (39) and characterized physiologically (54) was aph1, a diadenosine tetraphosphate (AppppA) hydrolase from S. pombe. As aph1 possesses the canonical HφHφHφφ motif, this enzyme was recognized as a member of the HIT superfamily (39), then populated by Branch I sequences, which were not yet known to encode nucleotide-binding proteins or enzymes. In addition, careful sequence inspection allowed these investigators to recognize that aph1 is distantly related to the HXHXQ-containing AppppA phosphorylases (39), which we now recognize as members of Branch III of the HIT superfamily. Thus, despite absence of statistically convincing sequence similarity and presence of mechanistic difference between hydrolases and phosphorylases, prescient observations made upon cloning aph1 (39) and later expanded in the review literature (55) form the basis for our current classification of three branches of HIT proteins.
Study of aph1-related molecules took on a sense of urgency with the discovery that the gene at the human chromosome 3 fragile site, which is disrupted in a familial chromosomal translocation associated with early-onset renal carcinoma and is lost in many cancers of the digestive tract, is the human aph1 homolog (44). Spanning the chromosome 3 fragile site FRA3B, which is the most fragile site in the human genome, and possessing a HIT motif, the gene was named FHIT for fragile histidine triad gene (44). It is now clear that FHIT deletions are among the earliest and most frequent genetic alterations in development of tumors in environmentally-exposed tissues such as the lung (56, 57). Because the chromosomal location of FHIT is fragile and susceptible to deletions, the null hypothesis stated that losses at the FHIT locus are not contributing causes of cancer but rather consequences of the genome instability of p53- tumors (58). Though the kinetics of FHIT inactivation in the lung are inconsistent with this hypothesis (loss of Fhit protein occurs earlier and more frequently than TP53 mutation (56)), it was critically important to determine whether re-expression of the FHIT gene in fhit- cancer cell lines suppressed tumorigenicity. This has been demonstrated convincingly in several experimental systems (59-66) and has been recently reviewed (67).
The murine fhit gene is encoded on a syntenic portion of chromosome 14 and also spans a fragile site (68, 69). Murine fhit is not an essential gene and fhit-/- mice survive and develop a wide range of tumors including lymphomas and sebaceous tumors at higher incidence than wild-type mice (70). Taking advantage of the fhit gene’s susceptibility to mutational inactivation, it was discovered that treatment of fhit+/- mice with low doses of the carcinogen N-nitrosomethylbenzylamine (NMBA) led rapidly to development of Fhit- stomach and sebaceous tumors (71). Interestingly, stomach and sebaceous tumors are observed in a subset of hereditary nonpolyposis colon cancer (HNPCC) termed Muir-Torre syndrome that is thought to be due to loss of mismatch repair genes, potentially indicating that loss of FHIT is an important target in HNPCC (71).
Because loss of FHIT in development of lung cancer occurs in preneoplastic lesions (56) and FHIT reexpression in fhit-/- cells suppresses tumor formation by induction of apoptosis (60, 61, 63, 64, 66), introduction of FHIT-expressing viruses has been explored as a preventative strategy in the NMBA-tumor induction model. Remarkably, administration of FHIT via viral vectors shortly after NMBA administration prevents tumor development (65), presumably by killing the cells that suffer their second hit at fhit.
Though FHIT re-expression in fhit- human or mouse cells induces death by apoptosis (60, 61, 63, 64, 66), relatively little can be said about the cellular pathway through which Fhit acts. There are three clues, however. First, the tumor suppressing function of Fhit does not depend on hydrolysis of ApnA (59, 62) but may depend on forming a complex with ApnA (72). Second, there is a Rosetta Stone relationship (73, 74) between Fhit and Nit, a novel member of the nitrilase superfamily (75, 76). Third, Fhit homologs are found through a sufficient swath of eukaryotic evolution to predate many players in developmentally regulated apoptotic pathways and are likely to have a more fundamental cellular function, whose alteration triggers apoptotic consequences in animals.
Fhit homologs from different organisms vary in their substrate specificity for AppppA and diadenosine triphosphate (ApppA) but the enzymes that have been characterized kinetically bind both nucleotides very specifically (42, 76). Whereas aph1 from S. pombe prefers AppppA (53), human Fhit (40) and S. cerevisiae Hnt2 (77, 78) prefer ApppA. Fhit homologs from human and C. elegans do not discriminate significantly between ApppA and AppppA in terms of Km, both enzymes being saturated by both substrates in the low micromolar range (42, 76), and thus Fhit homologous enzymes are thought to discriminate between substrates in the kcat term (42, 76). Moreover, as the deadenylylation step for ApppA and AppppA is not different, these enzymes must discriminate on the basis of the rate of adenylylation of substrates that are bound with nearly identical Km values (42).
ApnA are made in side reactions of certain tRNA synthetases (79) and are the subject of a comprehensive recent review (80). Because the functional consequences of elevated ApnA are not known, we considered three possible connections between the tumor suppressing function of Fhit and ApnA (72). First, elevated ApnA may have a proliferative or anti-apoptotic function that is potentiated by loss of Fhit. Second, ApnA may signal through Fhit in an antiproliferative or pro-apoptotic manner that is lost with loss of Fhit. Third, the tumor suppressing function may be ApnA-independent (72). Because the H96N mutant of Fhit had been shown to be extremely deficient in enzymatic activity (40) and yet functional in tumor suppression (59), it was important to characterize the nature of the biochemical defect. The mutant turned out to be more than 4,000,000-fold reduced in kcat and less than four-fold increased in the Km term with respect to the wild-type enzyme in hydrolysis of ApppA (72). This was clearly inconsistent with the idea that Fhit must reduce the concentration of ApnA to function in tumor suppression. Because the H96N mutant retained a low micromolar Km for ApppA and was found crystallographically to bind nonhydrolyzable ApppA in a manner nearly identical to that of wild-type Fhit, it was argued that the tumor suppressing function of Fhit may depend on forming an ApnA complex (72). Furthermore, the crystallographically defined Fhit dimer surface that becomes filled with two ApnA molecules was proposed as a possible site of effector interactions (72). The critical test of this model depends either on discovery of an effector that binds Fhit-ApnA, potentially in a manner that retards hydrolysis, or showing that Fhit mutants that are defective in substrate-binding are poor tumor suppressors (72). These are both active research areas.
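The kinetic argument can be made concrete with the Michaelis-Menten rate law. The sketch below uses only the fold-changes reported above, a 4,000,000-fold drop in kcat and at most a four-fold rise in Km. The absolute values (a wild-type Km of 2 µM, a wild-type kcat of 1 in arbitrary units, and a substrate concentration of 10 µM) are assumptions chosen for illustration, and treating Km as a stand-in for binding affinity is itself a simplification.

```python
def michaelis_menten_rate(substrate, kcat, km):
    """Turnover per enzyme active site at a given substrate concentration."""
    return kcat * substrate / (km + substrate)


def fractional_saturation(substrate, km):
    """Fraction of sites occupied, using Km as an approximate affinity."""
    return substrate / (km + substrate)


# Assumed illustrative values (not measured constants).
KCAT_WT = 1.0        # arbitrary units
KM_WT = 2.0          # micromolar, "low micromolar" in the text
SUBSTRATE = 10.0     # micromolar, hypothetical ApppA concentration

# Fold-changes taken from the text for the H96N mutant.
KCAT_H96N = KCAT_WT / 4_000_000
KM_H96N = KM_WT * 4  # upper bound: "less than four-fold increased"

occupancy_wt = fractional_saturation(SUBSTRATE, KM_WT)
occupancy_h96n = fractional_saturation(SUBSTRATE, KM_H96N)
rate_ratio = (
    michaelis_menten_rate(SUBSTRATE, KCAT_WT, KM_WT)
    / michaelis_menten_rate(SUBSTRATE, KCAT_H96N, KM_H96N)
)

if __name__ == "__main__":
    print(f"occupancy wild-type: {occupancy_wt:.2f}")
    print(f"occupancy H96N:      {occupancy_h96n:.2f}")
    print(f"rate ratio wt/H96N:  {rate_ratio:.3g}")
```

With these assumed numbers the mutant remains more than half occupied by substrate while turning it over millions of times more slowly, which is the pattern expected if complex formation rather than hydrolysis underlies tumor suppression.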
In the course of cloning FHIT-homologous genes from invertebrates, it was discovered that flies and worms encode Fhit in a fusion protein containing a novel ~300 a.a. polypeptide at its N-terminus that is a member of the nitrilase superfamily (75). Analysis of domain fusion events in sequenced genomes suggested that such fusions might be “Rosetta Stones” that decode a previously hidden cellular or biochemical interaction between proteins unrelated by sequence (73, 74). It was argued that fusions between domains that share a common phylogenetic profile (both genes are either represented in or both genes are absent from a genome) and which are coordinately expressed in organisms in which they are not fused are most likely to be indicative of a common cellular pathway (81).
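The phylogenetic-profile criterion described above, in which both genes are either present in or absent from each genome, can be expressed as a short comparison. This is a minimal sketch with hypothetical placeholder genomes and presence/absence calls; it does not reproduce the data or the method of the cited analyses.

```python
def share_phylogenetic_profile(profile_a, profile_b):
    """Return True if two genes are co-present or co-absent in every genome.

    Each profile maps a genome name to True (gene present) or False
    (gene absent). Both profiles must cover the same set of genomes.
    """
    if profile_a.keys() != profile_b.keys():
        raise ValueError("profiles must cover the same genomes")
    return all(profile_a[genome] == profile_b[genome] for genome in profile_a)


# Hypothetical presence/absence calls across three placeholder genomes.
gene_x = {"genome_1": True, "genome_2": False, "genome_3": True}
gene_y = {"genome_1": True, "genome_2": False, "genome_3": True}
gene_z = {"genome_1": True, "genome_2": True, "genome_3": True}

if __name__ == "__main__":
    print(share_phylogenetic_profile(gene_x, gene_y))
    print(share_phylogenetic_profile(gene_x, gene_z))
```

Exact matching is the strictest form of the test. With many genomes and imperfect annotation, a tolerance for a small number of mismatches would be a reasonable relaxation.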
Despite the fact that the “Nit” domain of fly and worm NitFhit had not been previously observed in any mammal, NIT genes were discovered in human and mouse as separate genes from FHIT, and murine nit1 and murine fhit were shown to have nearly identical patterns of tissue-specific mRNA accumulation (75). Furthermore, discovery of homologs of the Nit domain of NitFhit in frogs, fission yeast and budding yeast supported the co-evolution of these polypeptides in the organisms that have Fhit homologs (76). The crystal structure of worm NitFhit showed that Nit is a tetramer that binds two Fhit dimers and that the C-terminus of Nit is conserved to interact with Fhit (76). The Nit-interacting surface of Fhit is not the Fhit-ApnA surface (76) but the architecture of the 200 kDa NitFhit complex is such that it could potentially bind two Fhit-ApnA interacting molecules at opposite poles of the 140 Å tetramer.
The active site of all members of the nitrilase superfamily was inferred from the NitFhit structure to be a Glu-Lys-Cys catalytic triad (76). The nitrilase superfamily consists of at least 13 branches of enzymes, members of only one of which have been shown to possess nitrilase activity (82). Eight branches of the nitrilase superfamily have either demonstrated or predicted amide hydrolase or amide condensation activities for a variety of substrates and domain fusion events have been common in the nitrilase superfamily (82). Biochemical studies have yet to identify a substrate for the Nit active site of NitFhit or a Nit ortholog.
In addition to fungi (53, 77, 78), mammals (44, 68, 69), and invertebrates (75), Fhit homologs can be found in the genomes of microsporidia such as E. cuniculi (83), mycetozoa such as D. discoideum (unpublished), and green plants such as A. thaliana (84). According to the tree of eukaryotic life constructed from protein data (85), unless Fhit is found in plants due to a lateral transfer, the earliest rift that would have preceded appearance of Fhit was the separation of plants from animals, fungi, microsporidia and mycetozoa. It will therefore be important to look for Fhit homologs in organisms such as Tetrahymena, Plasmodium, Trypanosomes and Euglena as they are thought to be on the plant side of the plant/animal rift (85) that potentially preceded the appearance of Fhit. Because several apoptotic players including p53 and Bcl-2 family proteins are in animal-specific pathways, it is hard to reckon the primary function of Fhit being in direct management of apoptosis even though the biological effect of FHIT reexpression in fhit- cells has been apoptotic (60, 61, 63, 64, 66). A role more fundamental to eukaryotic cell biology, such as regulation of protein localization, which could have consequences that result in apoptosis in animal cells, seems more likely.
Branch III of the HIT superfamily consists of GalT homologs and other related nucleotide transferases. The second step in the Leloir pathway for galactose utilization requires galactose-1-phosphate to be transferred to UMP to form UDP-galactose (1). As shown in Figure 2, the reaction is catalyzed by GalT, which consumes UDP-glucose to form a covalent UMP intermediate with His166 (43, 86) releasing glucose-1-phosphate, and then reacts with galactose-1-phosphate to form UDP-galactose. Humans with mutations in the GALT gene are the largest class of galactosemics and are initially diagnosed with failure to thrive in the first days of life. Despite being put on a galactose-free diet, the long-term outcomes for GALT galactosemics include problems with cataracts and intelligence (87).
S. cerevisiae (88) and E. coli (89) GalT sequences were similar enough to guide cloning of human GalT (9). GalT sequences co-existed with Hint sequences (6, 90) in the databases for several years without notice. In fact, the first Hint structure (8) and the first GalT structure (2) were initially considered to be unrelated to any other solved structures before Hint-nucleotide structures were solved (7). Independently, computational approaches based on structural coordinates (10) and on sequence alone (11) were able to detect the similarity.
While the Leloir pathway and thus GalT homologs may be extant in all organisms that utilize galactose as a carbon source (1), at least two other enzymes exist in subsets of organisms that are recognizable by sequence and mechanism to be members of Branch III of the HIT superfamily. S. cerevisiae and some other fungi and bacteria possess AppppA phosphorylases (91-93) that produce ATP and ADP (not AMP) with overall retention of configuration (94). As with GalT, the reaction proceeds through a covalent adenylylated intermediate (95) with release of ATP, and the enzyme uses inorganic phosphate rather than water to phosphorolyze the adenylylated intermediate to produce the ADP product. The three S. cerevisiae Branch III enzymes, Gal7, Apa1 and Apa2, are included in Table 1. Finally, Thiobacillus denitrificans possesses a GalT-related enzyme that is now termed adenylylsulfate:phosphate adenylyltransferase, whose NMP-X substrate is AMP-SO4 (96). After producing a covalent adenylylated intermediate that was detected by mass spectrometry, this enzyme reacts with inorganic phosphate to produce ADP (96). Branch III enzymes are recognizable by related sequences that include a HXHXQ motif and mechanisms that depend on transferring the covalently bound nucleoside monophosphate to phosphate or a phosphorylated second substrate, rather than hydrolyzing it with water (Figure 2).
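As a rough illustration of how the final residue of the motif separates the branches at the sequence level, the following Python sketch scans a protein sequence for the core pattern and labels each hit by its last residue. It is a minimal sketch under stated assumptions: the pattern is reduced to H-x-H-x-(H or Q) without checking that the intervening residues are hydrophobic, the function name and labels are invented for this example, and the test peptides are made-up strings rather than real sequences. A pattern this short will also match by chance in unrelated proteins, so a hit is a candidate and not evidence of membership in the superfamily.

```python
import re

# Assumption: the motif core is reduced to H-x-H-x-(H or Q); the hydrophobic
# character of the intervening residues is deliberately not checked here.
# The zero-width lookahead lets overlapping candidates be reported.
HIT_CORE = re.compile(r"(?=(H.H.([HQ])))")

# Hypothetical labels keyed by the final residue of the matched core.
LABELS = {
    "H": "hydrolase-like (HXHXH)",
    "Q": "transferase-like (HXHXQ)",
}


def classify_hit_motifs(sequence):
    """Return (0-based start, matched residues, label) for each candidate core."""
    seq = sequence.upper()
    return [
        (m.start(), m.group(1), LABELS[m.group(2)])
        for m in HIT_CORE.finditer(seq)
    ]


# Hypothetical toy peptides, not real protein sequences.
print(classify_hit_motifs("AAHVHLHVV"))
print(classify_hit_motifs("AAHVHLQVV"))
```

A real search would use profile-based methods such as the PSI-BLAST approach cited above, since the motif alone carries too little information to establish homology.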
Rotaviruses, which are major causes of diarrhea in humans and other animals, consist of six structural and six nonstructural proteins encoded by eleven segments of double stranded RNA. Nonstructural protein two (NSP2) is a conserved basic protein that has been shown genetically to be required for RNA packaging (97, 98) and that possesses intrinsic single stranded RNA-binding activity and a magnesium-dependent nucleoside triphosphatase activity that converts GTP, ATP, UTP and CTP to the corresponding nucleoside diphosphates (99). The crystal structure of the NSP2 octamer revealed that an ~80 residue segment within the C-terminal domain of the 317 amino acid NSP2 monomer adopts a Hint-like three dimensional fold that is superimposable with beta strands 1 through 3, helix 2, and strands 4 through 5 of Hint (100). Though NSP2 is an octamer, NSP2 has no dimer interface that aligns with Hint’s dimer interface at helix 2 and strand 4, nor is there sequence conservation with Hint that would have indicated similarity by descent (100). While His221 of NSP2 can be aligned with His112 of Hint, no other residues important for nucleotide-binding or hydrolysis in Hint are found at corresponding positions in NSP2. The fact that NSP2 hydrolyzes any nucleoside triphosphate in a magnesium-dependent manner at the γ-phosphate, while HIT proteins are specific for adenylylation or uridylylation in a cation-independent manner at the α phosphate, suggests that NSP2 is not a HIT protein. However, we wish to revisit our tentative exclusion of NSP2 from the HIT superfamily in light of a nucleotide-bound NSP2 crystal structure. If NSP2 has a mode of nucleotide recognition that is nearly isosteric with that of Hint, one might be able to consider NSP2 a distant relative that has substituted a metal and new sequences for conserved residues, lost the characteristic dimer, and altered the site of nucleotide chemistry.
For now, we agree that the Hint-NSP2 resemblance is striking but convergent (100), and not mechanistically related.
Truly related sequences not only fold similarly but function similarly, so the extensive mechanistic dissection of GalT serves as a model for the entire HIT superfamily. In this regard, it is clear that both the nucleotidylation and the nucleotidyl transferase steps depicted in Figure 2 proceed with inversion of configuration at the α phosphate, leading to overall retention of configuration (101, 102). The GalT mechanism is thus extremely symmetrical: a UDP-hexose brings UMP in for covalent nucleotidylation of His166, producing a hexose-1-phosphate leaving group, and a hexose-1-phosphate retraces those steps to reclaim the nucleotide as part of a UDP-hexose product. The residues involved in recognizing the hexose-1-phosphate leaving/attacking groups have been defined crystallographically and, indeed, glucose and galactose are bound at the same site (4). The key difference between the HIT hydrolases in the Hint and Fhit branches and the transferases in the GalT branch is that hydrolases do not await a second substrate to accept the histidine-bound nucleotide but instead are competent to transfer the nucleotide to water. Might the nearly isosteric nature of the GalT leaving and attacking groups be the basis for GalT-UMP intermediates being resistant to hydrolysis? The existence of AppppA phosphorylases and adenylylsulfate:phosphate adenylyltransferase leads us to argue against this idea. Both of these enzymes are GalT-related transferases (96) that go through covalent intermediates (95, 96), with known stereochemistry (overall retention) in the case of AppppA phosphorylase (94). Because the AppppA leaving group, ATP, is not isosteric with the second substrate, phosphate, it is difficult to argue that substrate-binding sites remote from the nucleotidylated histidine enforce transfer rather than hydrolysis. We argue instead that Branch III enzymes lack the final histidine residue in the HIT motif, which is necessary to activate water for hydrolysis.
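The stereochemical bookkeeping behind "two inversions give overall retention" can be checked with a small Python sketch. It is only an illustration of the parity argument, assuming each step is classified as either an inversion or a retention at the α phosphate; the function name is invented for this example.

```python
def overall_stereochemistry(steps):
    """Combine per-step outcomes at a stereocenter into the net outcome.

    Each inversion flips the configuration once, so an even number of
    inversions restores the starting configuration.
    """
    parity = 1
    for step in steps:
        if step == "inversion":
            parity = -parity
        elif step != "retention":
            raise ValueError(f"unknown step outcome: {step!r}")
    return "retention" if parity == 1 else "inversion"


# Double displacement through a covalent nucleotidyl-histidine intermediate:
# nucleotidylation of the enzyme, then transfer to the acceptor.
print(overall_stereochemistry(["inversion", "inversion"]))  # retention
```

By the same bookkeeping, a single direct in-line displacement would give net inversion, which is why observed retention is taken as support for a covalent intermediate.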
Kinetic studies with H₂¹⁸O and with stereospecific α-thio labeled γ-(m-Nitrobenzyl) adenosine 5′-O-triphosphate demonstrated that Fhit hydrolyzes substrates with overall retention of configuration and addition of water to the AMP product (41). Given physical evidence (103) for a covalently adenylylated intermediate and the stereochemical evidence (41) for retention of configuration, it is reasonable to assume that the hydrolytic water comes “in-line” from the position formerly occupied by the α-β bridging oxygen of a Fhit substrate. The fact that Branch I and II enzymes and the non-GalT Branch III enzymes are less symmetrical than GalT (i.e., water or inorganic phosphate as the nucleotide acceptor as opposed to another hexose-1-phosphate) does not make us think that the geometry of hydrolysis or phosphorolysis is any different from the geometry of nucleotide transfer in GalT (4).
Taken this way, some of the catalytic efficiency of Fhit may derive from the enzyme providing the same entry route to a polarized water as it provided an exit route for ADP. In the case of Hint, whose substrates include phosphoramides such as AMP-NH2 and AMP linked to amino acids (17), the hydrolytic water’s final approach to attack the enzyme adenylylate should retrace the initial escape path of the amine leaving group. In both cases, the enzyme has a role in getting the leaving group to leave the confines of the active site, which is the area in which a leaving group can reattack. Altering the ionization state and/or solvating the leaving group is thus important because, by jettisoning the leaving group into bulk solvent, what was microscopically reversible becomes effectively irreversible.
We make these points to disagree with one aspect of the Fhit mechanism proposed as a result of stereochemical analysis (41). According to that proposed mechanism, His98, the final His of the HIT motif, which aligns with Gln168 of GalT, makes a hydrogen bond with an α phosphate oxygen of an ApppA substrate. In our structures of wild-type and mutant Fhit bound to the ApppA analog IB2, which has a phosphorothioate substitution on one α phosphate and substitution of a methylene group for the other α-β bridging oxygen (104), we observed His98 to be positioned to interact with the α-β bridging oxygen (72) and not an α phosphate oxygen like Gln168 of GalT (3). The distance between the His98 Nε and the α-β bridging oxygen in the co-crystal structure of Fhit H96N with nonhydrolyzable ApppA was 2.5 Å (72), which is in the range of low barrier hydrogen bonds that are frequently found at enzyme active sites (105). As shown in Figure 4, we propose that the final His in HIT hydrolases is positioned to interact with the amine or ADP leaving group and the attacking water as a general acid-base catalyst. The fourth invariant His residue in HIT hydrolases, Hint His51/Fhit His35, is positioned to stabilize the proposed positive charge of Hint His114/Fhit His98 in the ground state. As we interpret our X-ray structures (7, 72), the exit path for the leaving group and the entry path for the hydrolytic water lie between Hint His114/Fhit His98 and Hint Asn99/Fhit Gln83. Referring back to Figure 1B, which is a product complex between Hint and GMP (7), the α phosphate oxygen interpreted to result from water addition is the lower left oxygen shown between His114 and Asn99. In the case of Fhit, this mechanism also accounts for the catalytic roles for His35 and His98 in addition to the role of the nucleophilic His96, as defined by mutagenesis (40).
Further, in an accompanying paper (52), we describe how alteration of the position of the proposed general acid-base catalyst, His114 in chicken Hint, in the proposed Hint-ASW heterodimer (48-50) may reduce the activity of the heterodimer substantially more than two-fold.
Though we have structural but not kinetic evidence for Hint His114/Fhit His98 as a general acid-base, this mechanism is compelling for Hint because the amine product must be protonated to leave, and the incoming water would gain nucleophilicity while restoring His114 to an acid. In the case of Fhit, we hesitated to suggest that His98 protonates an ADP or ATP leaving group because the terminal phosphate has two negative charges in bulk solvent at pH 7. We now consider, however, that donation of a proton from His98 would render the terminal oxygen a poor nucleophile to reattack the α phosphate. In fact, a low barrier hydrogen bond may be so favorable to form that the rate-limiting step in adenylylation may be getting the leaving group to take the proton and go. We have previously observed that substrates with fundamentally unaltered chemical lability such as ApppA, AppppA, ApppBODIPY and GpppBODIPY are cleaved with increasingly poor rates, suggesting that the leaving groups of slow substrates may prefer to reattack rather than exit (42). Ultimately, the hydrogen bond network between His35 and His98 is postulated to promote transfer of the His98 Nε proton to the leaving group, rendering the adenylylation reaction not immediately reversible and leaving His98 basic to activate a hydrolytic water.
The most common galactosemic mutation in human GALT, Q188R, corresponds to the E. coli mutant Q168R, which has been characterized and found to be reduced a million-fold in rates of uridylylation and deuridylylation (106). It will be interesting to characterize a site-directed Q168H mutant in vitro to see whether alteration of the HXHXQ motif to a hydrolase-like HXHXH motif is sufficient to produce some hydrolytic activity. Though deletions and mutations in GALT are not usually lethal once galactose is eliminated from the diet (87), it is possible that mutations in GALT that produce UDP-hexose hydrolase activity have not been observed because of the potentially lethal consequences of consuming UDP-glucose.
In preparing this review, it would be difficult to overstate the value of the collected works of Dr. Perry A. Frey and co-workers on GalT structure and function, and the contributions of Drs. Preston Garrison and Larry D. Barnes on identifying enzymatic activities and the sequence and structural basis for function in the HIT superfamily. Additionally, I thank Drs. Dagmar Ringe, Charles P. Scott, Kay Huebner, Helen C. Pace, Keith W. Caldecott and G. Michael Blackburn for helpful discussions.
This work was supported by grants from the NIH (CA75954 and CA77738).