Alkaptonuria (AKU) is a rare autosomal recessive metabolic disorder, characterized by accumulation of homogentisic acid, leading to darkened urine, pigmentation of connective tissue (ochronosis), joint and spine arthritis, and destruction of cardiac valves. AKU is due to mutations in the homogentisate dioxygenase gene, HGD, whose product converts homogentisic acid to maleylacetoacetic acid in the tyrosine catabolic pathway. Here we report a comprehensive mutation analysis of 93 patients enrolled in our study, as well as an extensive update of all previously published HGD mutations associated with AKU. Within our patient cohort, we identified 52 HGD variants, of which 22 were novel. This yields a total of 91 identified HGD variations associated with AKU to date, including 62 missense, 13 splice site, 10 frameshift, 5 nonsense and 1 no-stop mutation. Most HGD variants reside in exons 3, 6, 8 and 13. We assessed the potential effect of all missense variations on protein function, using 5 bioinformatic tools specifically designed for interpretation of missense variants (SIFT, POLYPHEN, PANTHER, PMUT and SNAP). We also analyzed the potential effect of splice site variants using two different tools (BDGP and NetGene2). This study provides valuable resources for molecular analysis of alkaptonuria and expands our knowledge of the molecular basis of this disease.
Alkaptonuria (AKU; MIM# 203500) is a rare autosomal recessive disorder affecting 1 in 250,000 to 1 million people worldwide. AKU is caused by deficiency of homogentisic acid oxidase (HGD, EC 1.13.11.5), which leads to the accumulation of homogentisic acid (HGA), since it cannot be converted to maleylacetoacetic acid in the tyrosine catabolic pathway [Fernandez-Canon et al., 1996; Scriver et al., 2001; Phornphutkul et al., 2002]. The clinical findings of AKU result from the reaction of homogentisic acid and its homopolymeric oxidation products, i.e., benzoquinones, with connective tissue components. The benzoquinones form melanin-like polymers in a process that requires a copper-dependent enzyme, homogentisic acid polyphenol oxidase, that is present in mammalian skin and cartilage [Zannoni et al., 1969]. The oxidation of HGA causes darkening of the urine on standing or upon alkalinization (Fig. 1A) [Bödeker, 1859; Virchow, 1866; Scriver et al., 2001] and pigmentation (ochronosis) of cartilage and other connective tissues (Fig. 1B, 1C); patients later develop joint and spine arthritis in their thirties (Fig. 1D) and destruction of cardiac valves. AKU does not appear to reduce the lifespan of affected subjects [Srsen et al., 1985], but there is a high rate of disability, especially late in life [Perry et al., 2006]. Nitisinone, an inhibitor of the enzyme that forms HGA (i.e., para-hydroxyphenylpyruvic acid dioxygenase), lowers HGA production by 95% [Anikster et al., 1998; Phornphutkul et al., 2002; Suwannarat et al., 2005]; a clinical trial of nitisinone for AKU is underway with 40 patients, 20 receiving nitisinone and 20 control subjects (http://clinicaltrials.gov/; identifier: NCT00107783).
AKU patients generally present with either dark urine or early onset arthritis [Martin et al., 1952; Yules, 1954; Smith and Smith, 1955]. In our previous study [Phornphutkul et al., 2002], the diagnosis was based upon dark urine in 55% of 58 cases and upon chronic joint pain in 45%. AKU is definitively diagnosed by detecting gram quantities of homogentisic acid in the urine. This is easily achieved by gas chromatography-mass spectrometry (GC-MS) analysis as part of a urinary organic acid analysis. Other products of tyrosine metabolism are not generally elevated, and urinary amino acids are normal [Neuberger, 1947].
The gene responsible for AKU is HGD, coding for homogentisate dioxygenase (also called AKU or HGO, GenBank NM_000187). It is located on chromosome 3q21-q23. HGD contains 14 exons and covers 60 kb of genomic DNA [Fernandez-Canon et al., 1996]. It encodes a 49,973 dalton, 445 amino acid protein that forms a dimer of two trimers giving rise to a functional hexamer. The crystalline structure of the HGD protein has been resolved [Titus et al., 2000]. The 280 residue N-terminal domain has a central β-sandwich structure flanked by a β-sheet that interacts with the 140 residue C-terminal domain of neighboring subunits. The active site binds iron in a domain defined by His335, Glu341, and His371 side-chains [Titus et al., 2000].
An AKU mouse model (Hgdaku/aku) was identified from an ENU (ethylnitrosourea) mutagenesis screen [Montagutelli et al., 1994]. This mouse harbors a recessive splice site mutation, c.1006+2T>A, in its Hgd gene (GenBank NM_013547) located on mouse chromosome 16 (27.3cM). This mutation leads to skipping of one or two exons [Kress et al., 1999; Manning et al., 1999] and the generation of a premature stop codon [Kress et al., 1999]. Hgdaku/aku mice have high levels of HGA in the urine and plasma but have no signs of ochronosis [Kamoun et al., 1992]. A trial of nitisinone in this mouse model showed a significant reduction of HGA excretion in the urine [Suzuki et al., 1999].
Interestingly, AKU was discovered in some Egyptian mummies. In 1961, Simon and Zorab [Simon and Zorab, 1961] and in 1962, Wells and Maxwell [Wells and Maxwell, 1962] showed, by radiography, that some mummies had suffered from ochronosis and that AKU might have occurred frequently in ancient Egypt. In 1967, Gray presumed that the dystrophic calcification of ochronosis seen on the mummies’ radiographs was an artifact of the embalming process and therefore that AKU was not frequent in ancient Egypt [Gray, 1967]. However, in 1977, Stenn showed that the ochronotic Egyptian mummy Harwa, dating from 1500 b.c., had probably suffered from AKU [Stenn et al., 1977] by demonstrating that the chemical characteristics of the black pigment found on the articular surface of the pelvic bone were identical to those of air-oxidized homogentisic acid.
Here we present a comprehensive mutation analysis of 93 AKU patients (79 probands and 14 affected relatives) enrolled in a natural history study of AKU at the NIH. In addition, we list all previously published HGD gene mutations. Of the 91 potentially disease-causing human HGD variants identified, 22 were novel. Because the pathogenic consequences of variations in DNA sequence cannot always be determined unambiguously, we assessed the potential pathogenicity of all HGD missense and splice site variants using a variety of prediction programs. Hence, in this paper the term mutation will be used when the variant causes a nonsense change, a frameshift or a modification of the canonical splice sites. The term potentially disease-causing variation will be used for any variation in the DNA sequence with an unclear effect on the protein. The terms variant and variation will be used for any sequence modification.
Our patient population included 79 probands (158 independent alleles); their HGD variations and daily urinary excretions of HGA are listed in Table 1. Mutation analysis identified a total of 157 sequence variations of 52 types, including 6 insertions and/or deletions that resulted in a frameshift (3 novel), 7 splice site modifications (3 novel), 2 nonsense mutations (1 novel), a no-stop mutation (novel), and 36 missense variations (14 novel) (Table 2). These variations were scattered over all 14 exons of HGD, with the largest number of variants in exons 3, 6, 8, and 13 (Fig. 2, Table 3).
Note that patients 45, 46, 47 and 83 have Indian ancestry and are the only patients in whom variants 26 (c.365C>T; p.A122V) and 46 (c.504G>C; p.E168D) occurred.
In 72 probands, two candidate disease-causing HGD variations were identified (Table 1). In the 10 cases in which DNA from affected siblings was available (10 sibships, 13 affected siblings), the same variants were found (grey shaded in Table 1). In 4 patients, only one candidate disease-causing variation was detected (4 of 158 alleles, 2.5%), which is significantly lower than in other studies (7% in [Beltran-Valero de Bernabe et al., 1998] and 9% in [Phornphutkul et al., 2002]); the same variant was found in the case with an affected sibling. In three patients, three candidate disease-causing HGD variations were present (Table 1). No apparent correlation existed between a patient’s genotype and the level of excreted homogentisic acid. Moreover, excretion of homogentisic acid in urine varied within sibships (Table 1).
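The allele bookkeeping in this paragraph can be checked with a short calculation. The following is a minimal sketch using only the proband counts given above; the variable names are illustrative and not part of the study. It assumes each proband contributes two independent alleles.

```python
# Probands grouped by the number of candidate HGD variants found
# (counts from the text; the dictionary name is illustrative).
probands_by_variant_count = {1: 4, 2: 72, 3: 3}

total_probands = sum(probands_by_variant_count.values())
total_alleles = 2 * total_probands  # autosomal gene: two alleles per proband

# Total number of variant observations across the cohort.
variants_observed = sum(
    n_variants * n_probands
    for n_variants, n_probands in probands_by_variant_count.items()
)

# Alleles without a candidate variant: one for each proband with a single variant.
unresolved_alleles = sum(
    max(0, 2 - n_variants) * n_probands
    for n_variants, n_probands in probands_by_variant_count.items()
)
unresolved_pct = 100 * unresolved_alleles / total_alleles

print(total_probands, total_alleles, variants_observed)
print(f"{unresolved_alleles} unresolved alleles ({unresolved_pct:.1f}%)")
```

Under these assumptions the counts are consistent with the 79 probands, 158 alleles and 157 sequence variations reported for the cohort.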
After the first report that HGD mutations cause alkaptonuria [Fernandez-Canon et al., 1996], other molecular studies described a variety of HGD mutations and polymorphisms, including haplotype and mutation analysis of the HGD gene in 29 previously unstudied AKU patients, identifying 12 novel potentially disease-causing variations, 5 polymorphic sites, and mutation hotspots [Beltran-Valero de Bernabe et al., 1998; Beltran-Valero de Bernabe et al., 1999a]. In 1999, the molecular defects of 30 AKU patients from central Europe were described, including 5 novel HGD potentially disease-causing variations [Muller et al., 1999]. In Slovakia, the country with the highest frequency of AKU, 2 recurrent variations, c.16-1G>A (INV1-1G>A) and p.G161R, were identified in more than 50% of the patients’ chromosomes, indicating that two independent founders contributed to the region’s high prevalence of AKU [Zatkova et al., 2000a; Zatkova et al., 2000b; Zatkova et al., 2003]. In addition, case reports of sequence modifications associated with AKU were described in patients of Japanese [Higashino et al., 1998], Finnish [Beltran-Valero de Bernabe et al., 1999b], Spanish [Rodriguez et al., 2000], Italian [Porfirio et al., 2000; Mannoni et al., 2004], Dominican [Goicoechea De Jorge et al., 2002], Algerian [Ladjouze-Rezig et al., 2006] and other descents [Beltran-Valero de Bernabe et al., 1998; Beltran-Valero de Bernabe et al., 1999a; Felbor et al., 1999; Phornphutkul et al., 2002; Srsen et al., 2002; Grasko et al., 2009] (Table 3).
We combined HGD variants reported in the Human Gene Mutation Database (HGMD) [Stenson et al., 2003] with an extensive literature search for HGD variations. This yielded 69 HGD variants (65 found in HGMD) described as AKU-associated mutations from individuals of diverse ethnicities (Spanish, Slovakian, Japanese, Finnish, French, Italian, German, Dominican, Algerian, Indian, Turkish and other patients from central Europe). Of these variations, 48 (69.5%) were missense, 10 (14.5%) splice site, 4 (5.5%) nonsense, 3 (4.5%) insertion, 3 (4.5%) deletion and 1 (1.5%) insertion/deletion.
Here we describe a total of 91 HGD variants (62 missense, 13 splice site, 10 frameshift, 5 nonsense and a no-stop), 52 from our study (22 novel) and 39 previously reported. The frameshift variants, the no-stop variant, and the variants modifying the canonical splice sites are highly likely to be pathogenic and to result in a deleterious HGD protein. However, the pathogenicity of the missense mutations is less clear. To investigate the effect of the 62 HGD missense variants, we evaluated their possible pathogenicity using different bioinformatic prediction tools: BLOSUM [Henikoff and Henikoff, 1992], SIFT [Ng and Henikoff, 2003; Xi et al., 2004], POLYPHEN [Ramensky et al., 2002], PANTHER [Thomas et al., 2003; Thomas and Kejariwal, 2004; Thomas et al., 2006], PMUT [Ferrer-Costa et al., 2005] and SNAP [Bromberg and Rost, 2007; Bromberg et al., 2008] (Supp. Methods). The results of these different tools are listed in Supp. Table S1. Based upon the substitution matrix BLOSUM 62, 31 of 62 missense variations (50%) had a value <-2, classifying them as deleterious. Of the remaining 31, 25 (40%) were ambiguous (score between −1 and +1) and 6 (10%) were classified as benign (score > +1) (Supp. Table S1).
Among the 62 candidate missense variations submitted to the SIFT program, 55 (88.5%) were identified to be deleterious with a tolerance index score <0.05, one (1.5%) was ambiguous, and 6 (10%) were benign (Supp. Table S1). SNAP predicted 55 (90%) of these variants to have a non-neutral effect on the protein and 6 (10%) to have a neutral effect. When the same variants were submitted to POLYPHEN, 50 (80.5%) were called probably damaging, 46 based on the alignment and 4 based on their effect on the 3D structure of the protein. Eight (13%) were considered possibly damaging and only four (6.5%) were called benign (Supp. Table S1).
The results obtained using PANTHER and PMUT were less conclusive. PANTHER predicted 39 variants (63%) to be mutations (subPSEC <-3) and 23 (37%) to be nsSNP. PMUT called only 35 variants (56.5%) pathological, with a neuronal score over 0.5; only 14 (22.5%) had a confidence index over 5. The other 27 (43.5%) were considered neutral (neuronal score under 0.5), of which only 10 (16%) had a confidence index over 5 (Supp. Table S1).
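The cutoffs quoted above for SIFT, PANTHER and PMUT can be written as simple decision rules. The sketch below is illustrative only: the function names and the example scores are hypothetical, the handling of values exactly at a cutoff is an assumption, and the "ambiguous" category reported for one SIFT result is not modeled because its definition is not given in the text.

```python
from typing import Tuple


def sift_call(tolerance_index: float) -> str:
    """SIFT: a tolerance index below 0.05 is read as deleterious."""
    return "deleterious" if tolerance_index < 0.05 else "tolerated"


def panther_call(subpsec: float) -> str:
    """PANTHER: a subPSEC score below -3 is read as a mutation, otherwise nsSNP."""
    return "mutation" if subpsec < -3 else "nsSNP"


def pmut_call(score: float, confidence: int) -> Tuple[str, bool]:
    """PMUT: score over 0.5 is pathological; confidence index over 5 is reliable.

    A score of exactly 0.5 is treated as neutral here (an assumption).
    """
    label = "pathological" if score > 0.5 else "neutral"
    return label, confidence > 5


# Hypothetical scores, for illustration only (not taken from Supp. Table S1).
print(sift_call(0.01))       # below the 0.05 cutoff
print(panther_call(-4.2))    # below the -3 cutoff
print(pmut_call(0.8, 7))     # above both PMUT cutoffs
```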
We examined the possible pathogenicity of all the splice site variations using different analysis programs, including the splice site prediction tool from the Berkeley Drosophila Genome Project (BDGP) web site (http://www.fruitfly.org/seq_tools/splice.html) [Reese et al., 1997] and NetGene2, available from the Center for Biological Sequence Analysis (CBS), http://www.cbs.dtu.dk/services/NetGene2/ [Hebsgaard et al., 1996]. The results of these two programs are listed in Supp. Table S2. Both programs showed a drastic decrease in score for 11 (84.5%) of the 13 variants, all of which affected a canonical splice site. Interestingly, for 4 variants, BDGP predicted that the alteration may create a new splice site further downstream in the intron (Supp. Table S2). For the two other variants, c.650-56G>A and c.650-17G>A, the BDGP score was unchanged (0.84), and the NetGene2 NN score either did not vary (0.662) or varied only slightly (0.635). These two previously reported variants are therefore predicted not to affect splicing.
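Whether an intronic variant falls on a canonical splice site can be read directly from its cDNA-level name: a "+" offset counts from the last nucleotide of the preceding exon (donor side), a "-" offset counts back from the first nucleotide of the following exon (acceptor side), and offsets of 1 or 2 correspond to the canonical GT or AG dinucleotide. The following minimal sketch applies that rule to variant names quoted in this paper; the regular expression and function name are illustrative, and the sketch handles simple substitutions only. It does not replace the BDGP or NetGene2 predictions.

```python
import re
from typing import Tuple

# Matches intronic substitutions such as c.650-56G>A or c.1006+2T>A.
INTRONIC_SUBSTITUTION = re.compile(r"^c\.(\d+)([+-])(\d+)([ACGT])>([ACGT])$")


def classify_intronic(hgvs: str) -> Tuple[str, int, bool]:
    """Return (splice-site side, distance from the exon, at canonical site?)."""
    match = INTRONIC_SUBSTITUTION.match(hgvs)
    if match is None:
        raise ValueError(f"not a simple intronic substitution: {hgvs}")
    sign, offset = match.group(2), int(match.group(3))
    side = "donor" if sign == "+" else "acceptor"
    canonical = offset <= 2  # positions +1/+2 (GT) or -1/-2 (AG)
    return side, offset, canonical


for name in ["c.1007-2A>T", "c.1006+2T>A", "c.650-56G>A", "c.650-17G>A"]:
    print(name, classify_intronic(name))
```

The two variants at c.650-56 and c.650-17 lie well outside the canonical dinucleotides, which is consistent with the unchanged prediction scores reported above.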
We identified three probands (Patients 36, 93 and 94), each with three HGD candidate disease-causing mutations. Patient 36 carried a nonsense mutation p.E168X and a splice site modification, c.1007-2A>T, strongly predicted to affect the splicing by BDGP and NetGene2, since it modifies the AG of the acceptor site. Patient 36 also carried a missense variant p.N149K which was predicted to be deleterious by POLYPHEN, SIFT and SNAP, ambiguous by PMUT and benign by PANTHER. None of these three variants was previously reported.
Patient 93 had three missense variations. One, p.S305F, is universally predicted as deleterious by all prediction tools and has been previously described [Phornphutkul et al., 2002]. POLYPHEN, PMUT, PANTHER and SNAP predicted the second variant in Patient 93, p.Q258P, to be deleterious; SIFT considered it benign. This variant was also previously reported [Phornphutkul et al., 2002]. POLYPHEN, PMUT and PANTHER predicted the third variant in Patient 93, p.E3A, to be benign and SIFT predicted it to be deleterious.
The third patient with three candidate disease-causing mutations, Patient 94, carried a previously reported missense variant p.G161R [Gehrig et al., 1997]. Functional analysis of this variant showed a loss of 99% of HGD enzymatic activity [Rodriguez et al., 2000]. The second variant in Patient 94, c.16-1G>A, which was previously reported [Beltran-Valero de Bernabe et al., 1999a], represents a strongly predicted splice site modification, since it affects the AG of the acceptor site. The third variant of Patient 94 was an indel, c.1387_1389delGAGinsTA, which creates a frameshift and is expected to produce a truncated protein, p.M339IfsX368. It is a novel mutation, and was detected in three other unrelated probands in this study (Patients 15, 59, 81 and siblings).
We conclude that for these 3 patients, without further functional studies, it is impossible to indicate the disease-causing variants because all three variants have a potentially deleterious effect on protein function.
Few HGD polymorphisms have been described in non-AKU populations. The database dbSNP reports 7 variants (Table 4), but these include only one synonymous SNP, c.477G>T. The other variants reported are 4 missense variants and 2 frameshifts. Of the 6 variants that modify the protein sequence, only rs2255543:A>T was confirmed to be a polymorphism. Two of the 4 missense variants, rs28941783:G>A and rs28942100:C>T, were previously associated with AKU [Fernandez-Canon et al., 1996; Gehrig et al., 1997]. The three other variants may also be disease-causing; indeed, the two frameshift variants, rs34214309 and rs35952153, both truncate the protein (p.V324GfsX3 and p.N337EfsX5) and are therefore likely pathogenic.
During the sequencing of our cohort of AKU patients, we identified 19 polymorphisms (Table 4); only one, rs2255543:A>T, was in the coding region. Seven of them have not been reported previously (Table 4). We searched the Database of Genomic Variants (http://projects.tcag.ca/variation/) [Iafrate et al., 2004] for known copy number variations (CNVs); none has been identified in the coding region or in the introns.
In 1584, Scribonius first described a child with black urine. In 1859, Bödeker [Bödeker, 1859] and, in 1866, Virchow [Virchow, 1866] reported several cases and introduced the terms alkapton and ochronosis. AKU was one of the four “inborn errors of metabolism” that Sir Archibald Edward Garrod used to describe “Chemical Individuality” and to explain the idea of inheritance [Garrod, 1902]. Over the past century, the clinical and laboratory features of AKU have been well established, making diagnosis straightforward. However, in sporadic cases a patient can be misdiagnosed with AKU. For example, the use of minocycline, a tetracycline used for treatment of dermatologic or rheumatologic disorders, can induce hyperpigmentation of the auricle and bluish discoloration of the sclera that can be confused with ochronosis and AKU. Therefore, a definite diagnosis of AKU should be confirmed by quantitative measurement of HGA in the urine [Suwannarat et al., 2004]. In our patients, the mean urinary HGA excreted was 4.8 ± 2.2 g/day (2.9 ± 1.2 μmole of HGA/μmole of creatinine), which is 100 times the normal excretion [Introne et al., 2007] (available at http://www.genetests.org).
If two deleterious HGD variants are found in an AKU patient, they are likely disease-causing. However, the term deleterious should be used with recognition of the fact that a certain minimum reduction in enzyme activity is required before clinical disease occurs; a mildly deleterious allele may not be consistent with AKU. It is known that the human liver produces enough homogentisic acid oxidase to convert over 1.5 kg of homogentisic acid per day [Scriver et al., 2001]. Therefore, for a patient to display AKU symptoms, a loss of more than 99% of the enzyme activity is required. Variability in residual HGD enzymatic activities may explain the absence of a clear correlation between genotype and phenotype. The amount of protein in the diet may also play a role, but a direct effect of diet on HGA excretion in urine has not been demonstrated [Phornphutkul et al., 2002; Introne et al., 2007]. Note that the variance of HGA excretion is smaller within affected siblings (common genotype), on average 0.5 μmoles HGA/μmoles creatinine, than among unrelated patients (different genotypes), 1.6 μmoles HGA/μmoles creatinine.
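The 99% figure can be illustrated with a back-of-the-envelope calculation. The sketch below is not from the study: it assumes that the mean urinary HGA excretion of the cohort (4.8 g/day) approximates the daily HGA load that the enzyme would have to clear, and it takes the hepatic capacity as exactly 1.5 kg/day although the text states "over 1.5 kg".

```python
# Values from the text; variable names are illustrative.
liver_capacity_g_per_day = 1500.0    # hepatic capacity to convert HGA (lower bound)
mean_hga_load_g_per_day = 4.8        # mean urinary HGA excretion in this cohort

# Assumption: urinary excretion approximates the daily HGA load to be cleared.
# Residual activity at or above this fraction would still clear the whole load.
max_residual_pct = 100 * mean_hga_load_g_per_day / liver_capacity_g_per_day
min_loss_pct = 100 - max_residual_pct

print(f"residual activity must fall below about {max_residual_pct:.2f}%")
print(f"corresponding loss of activity: above about {min_loss_pct:.2f}%")
```

Under these assumptions, the required loss of activity exceeds 99%, in line with the statement above.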
In AKU patients with only one identified HGD variant in the coding sequence or exon boundaries, the other disease-causing variant may lie in unsequenced regulatory regions or introns. Mosaicism could also explain the absence of a second mutation; sequencing DNA from other tissues may detect it. Therefore, if only one HGD variant is found, studying the RNA or the protein may help to find the missing mutation: mis-splicing would point to a mutation in an intron, and an abnormal expression level would point to a mutation in a regulatory region. Haplotype analysis, with comparison among patients lacking an identified mutation, can also be performed. If two variants of unclear pathogenic significance are identified, the best way to determine their pathogenicity is a functional assay.
A total of 91 HGD variants (52 from our study, 39 previously reported) in patients with AKU are described in this report, including 62 missense, 13 splice site, 10 frameshift, 5 nonsense and a no-stop (Supp. Tables S1, S2 and S3). Of these, 22 were previously unreported. In our NIH AKU patient cohort, a total of 52 different potentially disease-causing variants were identified. Among these variations, 4 were highly recurrent; p.G161R was detected 10 times, c.174delA 11 times, p.C120F 10 times, and p.M368V 26 times. Some other variations were also detected more than once (2 to 8 times) in our cohort (Table 3). This may be due to the existence of founder effects. It was hypothesized that mutations in HGD are concentrated near or in CCC motifs (or GGG) [Beltran-Valero de Bernabe et al., 1999a]. Indeed, they found 10 of the 29 (34.5%) nucleotide changes in or near a CCC motif. However, our current study shows that only 18 of the 91 (20%) HGD variations are in or near CCC motifs, decreasing the importance of these motifs in HGD mutability.
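The totals and percentages in this paragraph can be reproduced from the stated counts. A minimal sketch follows; the variable and function names are illustrative, and all numbers are taken from the text.

```python
# Variant counts by type, as listed in the text.
variant_types = {
    "missense": 62,
    "splice site": 13,
    "frameshift": 10,
    "nonsense": 5,
    "no-stop": 1,
}
# Variant counts by source.
variant_sources = {"this study": 52, "previously reported": 39}

total_by_type = sum(variant_types.values())
total_by_source = sum(variant_sources.values())


def percent(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Share of nucleotide changes in or near CCC (or GGG) motifs.
ccc_earlier_pct = percent(10, 29)   # earlier report
ccc_current_pct = percent(18, 91)   # all variants compiled here

print(total_by_type, total_by_source)
print(ccc_earlier_pct, ccc_current_pct)
```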
We tried to predict the potential effects on HGD protein function and/or enzyme activity for all the identified (previously reported and novel) missense variants. For this, five current prediction tools (Supp. Table S1) were employed to estimate their effect. In general, the accuracy of these prediction tools is estimated at between 50% and 80% [Ng and Henikoff, 2006; Bromberg and Rost, 2007]. In the case of AKU, as the crystal structure is known [Rodriguez et al., 2000; Titus et al., 2000], tools that use the 3D structure of the protein, such as POLYPHEN and SNAP, may be more reliable. In addition, most missense substitution prediction tools today offer confidence scores or an estimate of the degree to which a substitution is deleterious. For only half of the identified missense variants did all five prediction tools give the same pathogenicity prediction (Table 5). Therefore, the creation of a functional assay for all the missense variants is required. In 2000, Rodriguez et al. characterized the effects of different HGD mutations on HGD enzyme activities. But, as mentioned before, a moderate decrease in HGD enzyme activity does not always explain the disease [Scriver et al., 2001]. Thus, correlation between a potential deleterious effect on the protein and the loss of enzyme activity needed to develop AKU is an important issue. Site-directed mutagenesis in mouse models or in human cell lines may be effective techniques to develop such a functional assay.
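Agreement between tools can be summarized per variant as a majority call plus a unanimity flag. The sketch below uses the calls reported in the Results for the missense variants of Patients 36 and 93; the data structure and function name are illustrative, and p.E3A has only four entries because no SNAP call is stated for it in the text.

```python
from collections import Counter
from typing import Dict, Tuple

# Tool calls as reported in the text for Patients 36 and 93.
calls: Dict[str, Dict[str, str]] = {
    "p.N149K": {"POLYPHEN": "deleterious", "SIFT": "deleterious",
                "SNAP": "deleterious", "PMUT": "ambiguous", "PANTHER": "benign"},
    "p.S305F": {"POLYPHEN": "deleterious", "SIFT": "deleterious",
                "SNAP": "deleterious", "PMUT": "deleterious", "PANTHER": "deleterious"},
    "p.Q258P": {"POLYPHEN": "deleterious", "PMUT": "deleterious",
                "PANTHER": "deleterious", "SNAP": "deleterious", "SIFT": "benign"},
    "p.E3A": {"POLYPHEN": "benign", "PMUT": "benign",
              "PANTHER": "benign", "SIFT": "deleterious"},
}


def summarize(tool_calls: Dict[str, str]) -> Tuple[str, int, bool]:
    """Return (most frequent call, number of tools making it, all tools agree?)."""
    counts = Counter(tool_calls.values())
    label, votes = counts.most_common(1)[0]
    return label, votes, len(counts) == 1


for variant, tool_calls in calls.items():
    print(variant, summarize(tool_calls))
```

Of these four variants, only p.S305F receives a unanimous call, which illustrates why a majority vote alone is a weak substitute for a functional assay.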
In sum, the current study provides a valuable resource for the molecular analysis of alkaptonuria. It is often difficult to predict the pathogenicity of each mutation, especially of missense variants. We demonstrate the use of pathogenicity prediction tools, and conclude that for an accurate prediction of the pathogenicity of some HGD variants a functional (enzyme) assay is required.
We thank Chanika Phornphutkul and David Adams for expert advice on this study and for assistance with mutation prediction tools and Maya Tuchman and Carla Ciccone for skillful laboratory assistance.
This work was supported by the Intramural Research Program of the National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA.
Supporting Information for this preprint is available from the Human Mutation editorial office upon request (humu/at/wiley.com)