Re-use of this article is permitted in accordance with the Creative Commons Deed, Attribution 2.5, which does not permit commercial exploitation.
Homozygosity or compound heterozygosity for the c.833T>C transition (p.I278T) in the cystathionine beta-synthase (CBS) gene represents the most common cause of pyridoxine-responsive homocystinuria in Western Eurasians. However, the frequency of the pathogenic c.833C allele, as observed in healthy newborns from several European countries (qc.833C ≈ 3.3 × 10⁻³), is ~20-fold higher than expected on the basis of the observed number of symptomatic homocystinuria patients carrying this mutation (qc.833C ≈ 0.18 × 10⁻³), implying clinical underascertainment. Intriguingly, the c.833C mutation is also present in combination with a 68-bp insertion, c.[833C; 844_845ins68], in a substantial proportion of chromosomes from nonhomocystinuric individuals worldwide. We have sought to study the relationship between the pathogenic and nonpathogenic c.833C-bearing chromosomes and to determine whether the pathogenic c.[833C; –] chromosomes are identical-by-descent or instead arose by recurrent mutation. Initial haplotype analysis of 780 randomly selected Czech and sub-Saharan African wild-type chromosomes, employing 12 intragenic markers, revealed 29 distinct CBS haplotypes, of which 10 carried the c.[833C; 844_845ins68] combination; none carried an isolated c.833C or c.844_845ins68 mutation. Subsequent examination of 69 pathogenic c.[833C; –] chromosomes, derived from homocystinuria patients of predominantly European origin, disclosed three unrelated haplotypes that differed from their wild-type counterparts by virtue of the presence of c.833C, thereby indicating that the c.833T>C transition has occurred repeatedly and independently in the past. Since c.833T does not reside within an obvious mutational hotspot, we surmise that the three pathogenic and comparatively prevalent c.[833C; –] chromosomes may have originated by recurrent gene conversion employing the common nonpathogenic c.[833C; 844_845ins68] chromosomes as templates. Hum Mutat 28(3), 255–264, 2007. Published 2006 Wiley-Liss, Inc.
Autosomal recessive cystathionine beta-synthase (CBS) deficiency (MIM# 236200) is the most common inborn error of sulfur metabolism and was first recognized over four decades ago [Mudd et al., 1964]. Untreated CBS deficiency is characterized clinically by skeletal and ocular abnormalities, as well as thromboembolism and mental retardation. However, early treatment by administration of high doses of pyridoxine and/or by methionine restriction combined with betaine administration has proven effective in preventing many complications of this disease [Mudd et al., 2001]. The clinical, biochemical, and molecular aspects of CBS deficiency have recently been reviewed in detail [Kožich and Kraus, 2001; Mudd et al., 2001].
Molecular analysis of the CBS gene in patients with homocystinuria has led to the identification of some 130 different pathogenic mutations, the majority of which are rare and private [Kraus et al., 1999] (www.hgmd.org; www.uchsc.edu/cbs/cbsdata/cgidata.htm). Although several mutations are frequent, only the transition c.833T>C (p.I278T) in exon 8 of the CBS gene [Kraus et al., 1999] has been reported to occur in virtually every studied population of European origin (www.uchsc.edu/cbs/cbsdata/cgidata.htm). The nonpolar Ile278 residue is highly conserved between mammals, Neurospora sp., Anopheles gambiae, Pichia pastoris, and several bacterial species. However, other species such as Xenopus sp., Fugu rubripes, Drosophila melanogaster, Dictyostelium discoideum, Caenorhabditis elegans, and Saccharomyces cerevisiae have a smaller valine residue at the analogous position, suggesting that the enzyme may tolerate mild spatial alterations. By contrast, the substitution of the nonpolar isoleucine by the polar threonine is clearly unfavorable since the mutant enzyme is inactivated [Kožich and Kraus, 1992; Shan et al., 2001] as a consequence of misfolding, loss of heme, and protein aggregation [Janošík et al., 2001]. Despite the detrimental impact of the mutation on enzymatic properties, homozygosity or compound heterozygosity for the c.833C allele is consistently associated with a mild clinical phenotype, both in humans and in transgenic mice [Mudd et al., 2001; Wang et al., 2005]. More importantly, carriership of at least one c.833C allele in CBS-deficient patients confers clinical and biochemical responsiveness to vitamin B6 administration, and consequently necessitates a less severe therapeutic regimen [Kožich and Kraus, 2001; Mudd et al., 2001].
At present, the c.833C allele would appear to be the most common pathogenic CBS variant in Western Eurasians suffering from CBS deficiency. Indeed, in symptomatic homocystinuric patients, c.833C constitutes ~25% of all homocystinuric alleles (132/553 alleles as of April 2006; www.uchsc.edu/cbs/cbsdata/cgidata.htm). Further, an unusually high prevalence of heterozygotes for this variant has been observed among Danish, German, Czech, and Norwegian newborns (i.e., 1:71, 1:67, 1:257, and 1:63, respectively) and in healthy Dutch adults (i.e., 1:250) [Gaustadnes et al., 1999; Griffioen et al., 2005; Linnebank et al., 2001a; Refsum et al., 2004; Sokolová et al., 2001].
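The carrier ratios quoted above can be converted into approximate allele frequencies. The following is a minimal sketch, assuming Hardy-Weinberg proportions and treating each quoted ratio as the heterozygote frequency 2q(1 − q); the function name and the dictionary layout are illustrative and are not taken from the original study.

```python
import math


def allele_freq_from_carrier_freq(carrier_freq: float) -> float:
    """Solve 2q(1 - q) = carrier_freq for the rarer allele frequency q.

    Assumes Hardy-Weinberg proportions (an assumption of this sketch).
    """
    if not 0.0 <= carrier_freq <= 0.5:
        raise ValueError("carrier frequency must lie between 0 and 0.5")
    return (1.0 - math.sqrt(1.0 - 2.0 * carrier_freq)) / 2.0


# Carrier ratios quoted in the text: one heterozygote per N individuals.
carriers_one_in = {
    "Danish newborns": 71,
    "German newborns": 67,
    "Czech newborns": 257,
    "Norwegian newborns": 63,
    "Dutch adults": 250,
}

allele_freqs = {
    population: allele_freq_from_carrier_freq(1.0 / n)
    for population, n in carriers_one_in.items()
}

for population, q in allele_freqs.items():
    print(f"{population}: q = {q:.5f}")
```

For rare alleles the result is close to half the carrier frequency, which is why the simpler approximation q ≈ 1/(2N) is often used instead.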
Only a small proportion of human chromosomes carry the pathogenic mutation c.833C on its own (henceforth referred to as c.[833C; –] chromosomes), whereas a much larger proportion contain a nonpathogenic combination of two mutations (termed c.[833C; 844_845ins68] chromosomes). In the latter chromosomes, the pathogenic effect of c.833C is completely nullified by the downstream insertion of a 68-bp duplicated portion of the intron 7/exon 8 junction (Fig. 1C). Although this insertion creates two intron 7 splice acceptor sites in close proximity, the splicing machinery strongly favors the use of the more distal splice acceptor site, thereby removing the upstream segment of exon 8 together with the c.833C mutation while preserving the rest [Romano et al., 2002; Sperandeo et al., 1996b; Tsai et al., 1996]. Consequently, both heterozygotes and homozygotes carrying the nonpathogenic c.[833C; 844_845ins68] chromosomes synthesize normal CBS mRNA molecules lacking the pathogenic r.833U>C mutation; these individuals therefore exhibit neither biochemical nor clinical signs of homocystinuria [Pepe et al., 1999; Tsai et al., 1996]. The nonpathogenic c.[833C; 844_845ins68] chromosomes are very common in sub-Saharan Africa (up to 40% of control chromosomes), less frequent throughout Europe and America (5–10% of control chromosomes) [Franco et al., 1998b; Pepe et al., 1999], and comparatively rare in Asia (0.16–2.5% of control chromosomes) [Song et al., 2001; Zhang and Dai, 2001].
The relatively high frequency of pathogenic c.[833C; –] chromosomes in populations of European descent could have resulted from a number of different mechanisms, including recurrent mutation in a mutational hotspot or gene conversion using the common nonpathogenic c.[833C; 844_845ins68] chromosomes as templates. In addition, positive selection for carriers, random genetic drift or a combination of these mechanisms could have been responsible for boosting the population frequencies of the pathogenic c.[833C; –] chromosomes to their present values. We have adopted several different approaches in an attempt to distinguish between two competing hypotheses, namely: 1) that the pathogenic c.[833C; –] chromosomes are identical-by-descent; or 2) that these chromosomes arose by recurrent mutation, possibly involving c.[833C; 844_845ins68] chromosomes as templates.
All patients in the present study suffered from homocystinuria due to CBS deficiency, and were either homozygous or compound heterozygous for the pathogenic c.[833C; –] chromosomes. Through an international cooperative effort, samples of genomic DNA, blood or fibroblasts were obtained from a total of 54 unrelated subjects, representing 71 independent c.[833C; –] chromosomes. A total of 11 patients each were from the Netherlands and Italy, nine were from Germany, six from Denmark, five from Australia, four from the Czech Republic, two each from Ireland and France, and one each from Poland, Hungary, Japan, and Slovakia. All subjects except the Japanese patient were of European ancestry.
Control individuals facilitating the haplotype analysis of wild-type CBS chromosomes had previously been ascertained in two different studies. Banked genomic DNA samples were obtained from 200 anonymous Czech [Janošíková et al., 2003] and 190 anonymous sub-Saharan African controls (50 subjects each from the Fon, Dendi, and Bariba populations, and 40 individuals from the Berba population) [Pepe et al., 1999]. All controls were apparently healthy individuals with no clinical cardiovascular or connective tissue manifestations.
The study protocol was approved by the Ethics Committee of Charles University-1st Faculty of Medicine. All homocystinuria patients gave their informed consent via their primary care physicians. Control subjects gave their written informed consent during the course of sampling in the original study.
Genomic DNA was isolated from peripheral blood leukocytes or cultured human fibroblasts by column extraction. Genotypes for 12 polymorphic CBS markers (for overview, see Fig. 1A) were determined by fragment analysis of PCR products (c.1444_1467+7(16_21) [known as 31 bp variable number of tandem repeats (VNTR)]), amplification refractory mutation system PCR (ARMS-PCR) (c.699C>T, c.833T>C, c.844_845ins68, and c.1080C>T), PCR-RFLP (c.-3792G>A, c.699C>T, c.833T>C, c.1080C>T, and c.*543T>C [known as c.1985T>C]) and automated DNA sequencing (c.1358+161G>A, c.1358+264A>G, c.1359-219C>T, c.1359-134G>A, and c.1359-30C>T). The ARMS-PCR and PCR-RFLP techniques were used interchangeably since they yielded identical results. For detailed conditions and suppliers of chemicals and instruments, see Supplementary Table S1 (available online at http://www.interscience.wiley.com/jpages/1059-7794/suppmat). The analysis of c.833T>C and c.844_845ins68 was designed to permit unequivocal phase determination of these two markers. The cDNA sequence numbering was based upon GenBank reference sequence NM_000071.1 (the first adenosine in the initiator ATG was designated as +1), genomic DNA numbering was based upon GenBank reference sequence NC_000021.7 and the numbering of the 31-bp VNTR was based upon a literature report [Lievers et al., 2001].
A physical haplotyping method had to be developed in order to permit haplotype determination in all study subjects because no parental samples were available for phase determination. Since PCR amplification of a 20-kb segment of the CBS gene proved to be unreliable, PCR conditions to amplify a shorter 12.2-kb portion of the CBS gene between exons 5 and 17 were optimized and subsequently employed. Samples that yielded more than two haplotypes (probably due to PCR jumping [Pääbo et al., 1990]) and samples which failed in a 12.2-kb PCR amplification as a consequence of poor template DNA quality, were amplified in two overlapping 7.4-kb and 8.1-kb fragments spanning exon 5 to intron 13 and intron 9 to exon 17, respectively. Detailed conditions for the PCR reactions are given in Supplementary Table S1. The PCR products were gel-purified and cloned into a bacterial vector using the TOPO® XL PCR Cloning Kit (Invitrogen; www.invitrogen.com, Carlsbad, CA) according to the manufacturer's instructions. Individual alleles were then genotyped from plasmid DNA preparations as described above.
Throughout this work, the haplotypes have been described using the following reference haplotype: NM_000071.1: c.[-3792G>A; 699C>T; 833T>C; 844_845ins68; 1080C>T; intron 12 cluster; c.1444_1467+7(16_21); *543T>C]. The intron 12 variants c.1358+161G>A, c.1358+264A>G, c.1359-219C>T, c.1359-134G>A, and c.1359-30C>T were clustered as follows: α, [G; G; C; G; C]; β, [G; G; C; A; T]; γ, [G; G; T; A; T]; and δ, [G; A; C; G; C].
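The haplotype notation defined above can be expressed as a small lookup. This is a minimal sketch for illustration only: the names `INTRON12_CLUSTERS`, `classify_intron12`, and `format_haplotype` are hypothetical, and the "?" placeholder for an unlisted allele combination is an assumption rather than part of the published notation.

```python
# Intron 12 markers, in the order used by the cluster definitions in the text.
INTRON12_MARKERS = (
    "c.1358+161G>A",
    "c.1358+264A>G",
    "c.1359-219C>T",
    "c.1359-134G>A",
    "c.1359-30C>T",
)

# Allele combinations at the five intron 12 markers and their cluster names.
INTRON12_CLUSTERS = {
    ("G", "G", "C", "G", "C"): "α",
    ("G", "G", "C", "A", "T"): "β",
    ("G", "G", "T", "A", "T"): "γ",
    ("G", "A", "C", "G", "C"): "δ",
}


def classify_intron12(alleles):
    """Return the cluster name, or None if the combination is not listed."""
    return INTRON12_CLUSTERS.get(tuple(alleles))


def format_haplotype(c3792, c699, c833, has_ins68, c1080, intron12, vntr, c543):
    """Render a haplotype in the bracketed notation used in the text."""
    cluster = classify_intron12(intron12) or "?"  # "?" is a hypothetical placeholder
    insertion = "ins" if has_ins68 else "–"
    fields = [c3792, c699, c833, insertion, c1080, cluster, str(vntr), c543]
    return "[" + "; ".join(fields) + "]"


example = format_haplotype("G", "C", "C", False, "T",
                           ("G", "A", "C", "G", "C"), 18, "C")
print(example)
```

Keeping the five intron 12 alleles as a tuple makes the cluster assignment a plain dictionary lookup, so an unexpected combination is flagged instead of being silently assigned to a cluster.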
Genotypes for seven polymorphic markers were first examined in all 390 control DNA samples, and HAPMAX (available at www.uni-kiel.de/medinfo/mitarbeiter/krawczak/download/index.html) analysis was performed to estimate the possible haplotype structure of control chromosomes. Next, we searched manually for homozygotes carrying the predicted haplotypes. In these samples, genotyping was extended to the remaining five markers by sequencing intron 12. If heterozygosity was observed for more than one locus in the latter analysis, physical haplotyping was employed. Several haplotypes predicted by HAPMAX were, however, individually rare and were not found in the homozygous state. Their existence was therefore also confirmed directly by the physical haplotyping of selected samples.
Genotypes for 12 polymorphic markers were determined in all genomic DNA samples from homocystinuria patients. Of the 54 samples available, two did not amplify in any PCR and seven failed in long-range PCR. The remaining 45 samples were amenable to genotyping and physical haplotyping. CBS haplotypes were determined either directly as a consequence of homozygosity for at least 11 out of 12 markers, or by physical haplotyping.
For selected countries, we were able to estimate not only the population frequency of all homocystinuria-causing CBS alleles combined, but also that of the CBS alleles bearing the c.833T>C mutation. Since only a limited number of European countries perform long-term neonatal screening for CBS deficiency (and would thus have been able to provide reliable incidence data for deriving these estimates), we simply used the population prevalence of CBS deficiency to extrapolate allele frequencies. Data were first collected on the number of all known CBS-deficient patients and on their genotypes for each of the countries involved using a combination of three approaches: 1) a questionnaire-based survey, with 11 laboratories approached and eight replies received; 2) exploitation of the CBS Mutation Database (www.uchsc.edu/cbs/cbsdata/cgidata.htm); and 3) literature searches. Owing to the much greater ethnic heterogeneity of its population, no attempt was made to perform this analysis for the United States. Population sizes were taken from the United Nations World Population Policies 2003 (www.un.org/esa/population/publications/wpp2003/Publication_index.htm). For crude data, see Supplementary Table S2. Assuming Hardy-Weinberg equilibrium, the total frequency of disease-causing CBS alleles was calculated as the square root of the prevalence of CBS deficiency. Then, the frequency of the c.[833C; –] chromosomes was estimated by multiplying this number by the proportion of c.[833C; –] chromosomes observed among patients. Finally, the combined population frequency of disease-causing CBS chromosomes in Europe was calculated from summary data on the numbers of patients and the sum of respective population sizes of these countries.
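The two-step estimate described above can be sketched as follows. The function names are illustrative. The inputs are figures quoted elsewhere in this article (a European prevalence of ~1 in 1,500,000 and the database-wide proportion of 132/553 c.833C alleles), used here only as an example; the per-country data of Supplementary Table S2 are not reproduced, so the second result is not expected to match the published European estimate exactly.

```python
import math


def disease_allele_frequency(prevalence: float) -> float:
    """Total frequency of disease-causing alleles under Hardy-Weinberg
    equilibrium: affected individuals occur at frequency q**2."""
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must be a proportion between 0 and 1")
    return math.sqrt(prevalence)


def mutation_specific_frequency(q_total: float,
                                mutant_alleles: int,
                                patient_alleles: int) -> float:
    """Scale the total disease-allele frequency by the share of patient
    alleles that carry the mutation of interest."""
    if patient_alleles <= 0 or not 0 <= mutant_alleles <= patient_alleles:
        raise ValueError("invalid allele counts")
    return q_total * mutant_alleles / patient_alleles


# Illustrative inputs taken from figures quoted in the text.
q_total = disease_allele_frequency(1 / 1_500_000)
q_833c = mutation_specific_frequency(q_total, 132, 553)

print(f"all disease alleles: {q_total:.2e}")
print(f"c.833C alleles:      {q_833c:.2e}")
```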
Maximum likelihood estimates of haplotype frequencies were obtained from genotype data of unrelated individuals using the HAPMAX computer program. Confidence intervals for the disease prevalence and allele frequency estimates were calculated using the Wilson score method without continuity correction [Newcombe, 1998].
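For readers unfamiliar with the Wilson score interval, a minimal sketch of the formula without continuity correction follows. The function name and the example counts are hypothetical; the default z-value corresponds to a two-sided 95% interval.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.959964):
    """Wilson score interval for a binomial proportion,
    without continuity correction."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2.0 * n)) / denom
    half_width = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot at the extremes.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Hypothetical example: 3 carriers observed among 1,000 screened individuals.
lower, upper = wilson_interval(3, 1000)
print(f"95% CI: {lower:.5f} to {upper:.5f}")
```

Unlike the simple Wald interval, this interval remains informative when the observed count is zero or very small, which is the usual situation for rare disease alleles.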
First, we genotyped a set of seven polymorphic markers in all 390 control DNA samples from one European and four African populations; the African populations were included so as to provide the potential to acquire additional information on identical-by-descent c.833C-bearing chromosomes. Subsequently, five additional markers in intron 12 were genotyped in selected samples in order to determine their CBS haplotypes. The frequencies of most of the genotypes differed quite markedly between Czech and African controls (for details see Table 1). The frequencies of the c.1080C>T and c.*543T>C variants were substantially higher in the Czech population whereas the frequency of the c.[833C; 844_845ins68] double mutation was considerably lower than in Africans. Importantly, we did not detect any c.833T chromosomes carrying only the c.844_845ins68 lesion (i.e., c.[833T; 844_845ins68]), nor did we detect any chromosomes bearing exclusively the c.833C mutation (i.e., c.[833C; –]). Thus, the c.844_845ins68 variant was in absolute linkage disequilibrium with the c.833C mutation in all population samples analyzed.
Next, haplotypes were constructed either from homozygous genotypes or by performing physical haplotyping, employing all 12 polymorphic markers. Haplotypes involving intron 12 markers c.1358+161, c.1358+264, c.1359-219, c.1359-134, and c.1359-30 could be divided into four different clusters, which we have termed α [G; G; C; G; C], β [G; G; C; A; T], γ [G; G; T; A; T], and δ [G; A; C; G; C], respectively.
Our study revealed a total of 29 and 18 different wild-type CBS haplotypes, respectively, depending upon whether or not the VNTR c.1444_1467+7(16_21) genotype was included (for details see Table 2). Chromosomes carrying the c.[833C; 844_845ins68] double mutation were found in both populations. European chromosomes bearing the double mutation were found to harbor only four or two haplotypes, respectively, depending upon whether or not VNTR heterogeneity was considered, while the ancestral African chromosomes displayed much greater haplotype variability. We attempted to compare the haplotype structure observed in our cohort with those generated by the HapMap Project (www.hapmap.org). Although HapMap release 19/phase II from October 2005 contains an array of CBS haplotypes, no direct comparison was possible owing to the use of a completely different set of CBS SNPs by the HapMap project.
The population frequency of patients with clinically ascertained and biochemically confirmed CBS deficiency varied considerably between countries, averaging ~1 out of 1,500,000 inhabitants in Europe and yielding an estimated population frequency of pathogenic CBS alleles of 0.82 × 10⁻³. The population frequency of the c.833C allele, as estimated from the number of homocystinuria patients carrying at least one c.[833C; –] chromosome, varied by an order of magnitude between the 13 European countries, with the highest prevalence being observed in northern Europe. The calculated frequency of this allele in Europe (qc.833C ≈ 0.18 × 10⁻³) was ~20-fold lower than actually observed in healthy controls, mostly unselected newborns, from several European countries (qc.833C ≈ 3.3 × 10⁻³) (see Table 3). This discrepancy is suggestive of the decreased antenatal viability of c.[833C; –] homozygotes, the premature death of patients, or, rather more plausibly, an ascertainment bias with respect to patients having a clinically less apparent and milder pyridoxine-responsive form of the disease due to c.[833C; –] homozygosity.
Haplotypes of c.[833C; –] chromosomes and their frequencies were determined by a combination of physical haplotyping and statistical analysis. Physical haplotyping of two-thirds of the available mutant chromosomes revealed three distinct and unrelated haplotypes ([G; C; C; –; C; δ; 21; T], [G; C; C; –; T; δ; 18; C], and [G; T; C; –; C; γ; 19; C]) that differ from their respective wild-type counterparts only by the substitution of c.833C for c.833T. To reveal additional haplotypes that might have been present in samples that were not amenable to physical haplotyping, a maximum likelihood analysis was performed on the genotypes of 52 homocystinuric patients (representative of 69 c.[833C; –] chromosomes). As is evident from Supplementary Table S3, HAPMAX revealed the same three c.[833C; –] haplotypes as before, with frequency estimates that were virtually identical to those determined by physical haplotyping. Interestingly, the haplotype containing the δ cluster of intron 12 variants has been previously observed in patients of German origin (designated the “A1 haplotype” in Linnebank et al. [2001b]). We were unable to detect any c.833C-bearing haplotype containing cluster β of intron 12 variants (designated the “B1 haplotype” in the Linnebank et al. [2001b] study), and with hindsight this may have been due to different although partially overlapping sets of German patients having been included in the original and the present study.
The distribution of different pathogenic c.[833C; –] haplotypes varied between different European countries (see Fig. 2), reflecting the complex population history of the mutant CBS chromosomes. It is important to stress that we did not observe any heterogeneity in the number of VNTR repeats for any of the mutant haplotypes, consistent with a rather recent occurrence of all mutant c.[833C; –] chromosomes. The [G; T; C; –; C; γ; 19; C] haplotype was present in 8 out of 9 European countries tested, suggesting that it might be the most ancient one. By contrast, [G; C; C; –; T; δ; 18; C] was restricted to the northern/central part of Europe, while [G; C; C; –; C; δ; 21; T] was confined to Italy. This variable geographical distribution suggests either different migration patterns or the more recent occurrence of the latter two mutant haplotypes.
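The size of the discrepancy can be checked with a few lines of arithmetic. This is a minimal sketch using only the two allele frequencies quoted above; the per-million homozygote figure is an illustrative consequence of assuming Hardy-Weinberg proportions and is not a number reported in the study.

```python
# Allele frequencies quoted in the text.
q_observed_in_controls = 3.3e-3    # from healthy controls, mostly newborns
q_expected_from_patients = 0.18e-3  # from ascertained homocystinuria patients

# Ratio of the control-based estimate to the patient-based estimate.
fold_difference = q_observed_in_controls / q_expected_from_patients

# Homozygotes expected per million individuals if the control-based frequency
# held and genotypes followed Hardy-Weinberg proportions (q squared).
homozygotes_per_million = q_observed_in_controls ** 2 * 1_000_000

print(f"fold difference: {fold_difference:.1f}")
print(f"expected c.833C homozygotes per million: {homozygotes_per_million:.1f}")
```

The ratio comes out slightly above 18, which the text rounds to ~20-fold.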
There is no obvious reason to suppose that thymidine in position c.833 is hypermutable per se. Moreover, the immediate vicinity of the nucleotide c.833 does not contain any repetitive sequence elements that could have templated the c.833T>C transition and hence accounted for its independent occurrence on distinct haplotypes.
However, analysis of the regions flanking the adjacent nucleotide c.844, at which the 68-bp duplication occurs, revealed the presence of two sequence elements with mutational potential (see Fig. 1C). The first, at position c.834_839, was a 5′-TGGGGT-3′ sequence that matches the DNA polymerase α/β frameshift hotspot consensus sequence 5′-TG(G/A)(A/G)(G/A)(C/T)-3′ [Abeysinghe et al., 2003]. The second was an inverted repeat ATCCA/TGGAT, flanking the 68-bp insertion site that appears to be capable of forming an imperfect hairpin loop that could have mediated the exclusion of the inserted sequence while still preserving the pathogenic c.833C mutation on the same chromosome.
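Both sequence features described above can be verified directly from the quoted sequences. The sketch below is illustrative: the helper names are hypothetical, and only the short sequences given in the text are used, not the full CBS reference sequence.

```python
import re

# Consensus 5'-TG(G/A)(A/G)(G/A)(C/T)-3' written as a regular expression.
HOTSPOT_CONSENSUS = re.compile(r"TG[GA][AG][GA][CT]")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_hotspots(seq: str):
    """Return 0-based start positions of all (possibly overlapping) matches."""
    pattern = re.compile(r"(?=" + HOTSPOT_CONSENSUS.pattern + r")")
    return [m.start() for m in pattern.finditer(seq)]


# The c.834_839 sequence quoted in the text matches the consensus.
matches_consensus = HOTSPOT_CONSENSUS.fullmatch("TGGGGT") is not None

# The two arms of the repeat are reverse complements of each other.
is_inverted_repeat = reverse_complement("ATCCA") == "TGGAT"

print(matches_consensus, is_inverted_repeat)
```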
In summary, analysis of mutational hotspots and the haplotypes of c.833C-bearing chromosomes suggests a complex history for their generation, with neutral c.[833C; 844_845ins68] chromosomes having potentially templated the generation of mutant c.[833C; –] chromosomes via several mechanisms (see Discussion).
In this study, we have presented evidence to support the view that the most common homocystinuria mutation c.833T>C (p.I278T) occurred both repeatedly and independently during the recent history of European populations. This assertion is based mainly upon the observation of three unrelated CBS haplotypes containing solely the pathogenic c.833T>C substitution. Since the recurrence of the c.833T>C mutation cannot be explained by any known mutational hotspot in the vicinity of c.833T, a different mutational mechanism should be considered.
The high prevalence of the neutral c.[833C; 844_845ins68] chromosomes in Europe prompted us to propose that these nonpathogenic chromosomes may have templated the repeated generation of pathogenic c.[833C; –] chromosomes. Different mutational mechanisms such as meiotic recombination, loop formation with subsequent excision, or gene conversion could have been responsible for recurrently converting wild-type chromosomes into pathogenic c.[833C; –] ones, using neutral c.[833C; 844_845ins68] chromosomes as templates.
Meiotic recombination as a mechanism for repeated mutagenesis appears unlikely owing to the close physical proximity of nucleotides 833 and 844. Such a meiotic event would also be expected to generate the reciprocal product, a c.844_845ins68 chromosome lacking the c.833C mutation. In 15 published studies containing data on the phase of the c.833C mutation and the c.844_845ins68 variant, a total of 10,074 CBS chromosomes were genotyped and 1,721 c.[833C; 844_845ins68] chromosomes were found. However, not a single c.833T chromosome carrying only the c.844_845ins68 variant (i.e., c.[833T; 844_845ins68]) has been reported, indicating complete linkage disequilibrium of the c.844_845ins68 variant with c.833C [Aras et al., 2000; Dilley et al., 2001; Dutta et al., 2005; Fillon-Emery et al., 2004; Franco et al., 1998a, 1998b; Giusti et al., 1999; Griffioen et al., 2005; Janošíková et al., 2003; Orendáč et al., 1999; Pepe et al., 1999; Sokolová et al., 2001; Tsai et al., 1999, 2000; Zoossmann-Diskin et al., 2004].
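The absence of c.[833T; 844_845ins68] chromosomes among the 1,721 insertion-bearing chromosomes can be turned into an approximate upper bound on how common such recombinant chromosomes could be. The following sketch is not part of the original analysis: it assumes that the pooled chromosomes are independent draws from a single population, which is a simplification given that they come from 15 separate studies, and the function name is hypothetical.

```python
def zero_event_upper_bound(n: int, confidence: float = 0.95) -> float:
    """One-sided exact binomial upper bound on a proportion when
    zero events were observed in n independent trials."""
    if n <= 0 or not 0.0 < confidence < 1.0:
        raise ValueError("need n > 0 and 0 < confidence < 1")
    return 1.0 - (1.0 - confidence) ** (1.0 / n)


# Insertion-bearing chromosomes pooled from the published studies (from the text).
insertion_chromosomes = 1721

upper_bound = zero_event_upper_bound(insertion_chromosomes)
rule_of_three = 3.0 / insertion_chromosomes  # common quick approximation

print(f"95% upper bound: {upper_bound:.5f}")
print(f"rule of three:   {rule_of_three:.5f}")
```

Under these assumptions, fewer than about 2 in 1,000 insertion-bearing chromosomes would be expected to lack c.833C.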
The inverted repeat flanking the 68-bp insertion could, however, have templated the recurrent conversion of neutral c.[833C; 844_845ins68] chromosomes into pathogenic c.[833C; –] chromosomes via loop formation and excision. To explore this second hypothesis, we compared both the pathogenic and neutral haplotypes harboring c.833C. The two major European pathogenic haplotypes, i.e., [G; T; C; –; C; γ; 19; C] and [G; C; C; –; T; δ; 18; C], are unrelated to any of the neutral c.[833C; 844_845ins68] haplotypes. Therefore, their emergence by loop excision from any of the known neutral c.[833C; 844_845ins68] chromosomes would appear to be inherently unlikely. By contrast, the rare pathogenic haplotype [G; C; C; –; C; δ; 21; T] differs from the nonpathogenic haplotype [G; C; C; ins; C; δ; 17/19; T] only by virtue of the absence of the insertion at position c.844_845 and by the number of VNTR repeats. A complex, albeit still feasible, mechanism involving both loop excision and DNA polymerase slippage in the VNTR locus could have led to the formation of pathogenic [G; C; C; –; C; δ; 21; T] chromosomes. However, the putatively templating chromosome was only found in sub-Saharan African controls while the corresponding pathogenic haplotype was only present in Italy.
The third hypothesis invoking gene conversion, however, is favored by the observation that the c.833T>C mutation abuts the DNA polymerase α/β frameshift hotspot consensus sequence 5′-TG(G/A)(A/G)(G/A)(C/T)-3′, known to be involved in strand breakage. Moreover, all three mutant c.[833C; –] haplotypes have a wild-type c.[833T; –] counterpart, consistent with their emergence via gene conversion. In addition, the VNTR heterogeneity characteristic of the nonpathogenic c.[833C; 844_845ins68] chromosomes is lacking on mutant c.[833C; –] chromosomes. This is consistent with the view that the neutral c.[833C; 844_845ins68] chromosomes preceded the mutant c.[833C; –] chromosomes in evolutionary time. We therefore propose that, in individuals heterozygous for the wild-type c.[833T; –] and c.[833C; 844_845ins68] chromosomes, a double-strand break of the c.[833T; –] chromosome at the DNA polymerase α/β frameshift hotspot consensus sequence may have occurred. Subsequent strand invasion, formation of Holliday junctions and excision repair of an incomplete loop could then have formed the three mutant c.[833C; –] haplotypes (the proposed mechanism is depicted in Fig. 3). Gene conversion has been implicated in a number of different human diseases [Patrinos and Grosveld, 2003] and we propose that homocystinuria due to the c.833T>C transition may be yet another example.
A wide variability in the frequency of pathogenic c.[833C; –] chromosomes obtained from known homocystinuria patients in different populations is evident; northern Europe has the highest frequency while Japan has the lowest and the Iberian peninsula appears to be entirely devoid of c.[833C; –] chromosomes. If the nonpathogenic c.[833C; 844_845ins68] chromosomes did indeed template the creation of mutant c.[833C; –] chromosomes by gene conversion, one would expect that the population frequency of these mutant chromosomes would be positively correlated with the prevalence of the nonpathogenic ones. Consistent with this prediction, the low prevalence of c.[833C; 844_845ins68] chromosomes in Asia is indeed associated with the low prevalence, or even absence, of the c.[833C; –] chromosomes among Japanese [Katsushima et al., 2006] and Korean [Lee et al., 2005] homocystinuria patients, whereas a much higher prevalence of both the pathogenic and neutral chromosomes is apparent in Europe. Unfortunately, the assessment of this correlation could not be extended to sub-Saharan Africa due to a lack of reliable data on the frequency of homocystinuria in countries from this region, and because the control sample size in our study was probably too small to detect c.[833C; –]+[833T; –] heterozygotes.
It is at present unclear how much of the variability in the prevalence of the pathogenic c.[833C; –] chromosome can be attributed to the above-mentioned mutagenic mechanisms as opposed to random drift, migration, and selection. It is also unclear to what extent this variability is due to the negative clinical ascertainment bias of mildly affected patients carrying the pathogenic c.[833C; –] chromosome (homozygotes and compound heterozygotes for this allelic variant manifest a pyridoxine-responsive homocystinuria with a milder clinical phenotype and such individuals may not invariably have been diagnosed as having CBS deficiency). This possibility is supported by an increasing number of reports of c.[833C; –]+[833C; –] homozygotes and of c.[833C; –]+[other] compound heterozygotes who suffer from an unusual form of the disease, manifesting only a thrombotic diathesis without affecting connective tissue or the central nervous system [Gaustadnes et al., 2000, 2002; Linnebank et al., 2003; Maclean et al., 2002].
The phenotypic expression of the c.833T>C mutation in heterozygotes is essentially unknown [Guttormsen et al., 2001; Sperandeo et al., 1996a]. It is also unclear whether females heterozygous for the pathogenic c.[833C; –] chromosomes may have had a more favorable pregnancy outcome (reduced bleeding potential)—as proposed in cases of thrombophilia [Gopel et al., 2001; Lindqvist et al., 1998]—under the as yet unproven assumption that heterozygotes for CBS deficiency are in general more prone to thromboembolism [Mudd et al., 1981; Swift and Morrell, 1982]. It remains possible that the general living conditions could have been more disadvantageous to heterozygotes in Africa than in Europe, so that the question of whether or not the high frequency of c.833C in Europe may have been in part due to selection remains open.
Our study has revealed a considerable haplotype diversity of wild-type CBS chromosomes in the studied European and African populations. Were the VNTR polymorphism also to be considered, the number of haplotypes common to both populations would be 14. Czech controls carried six population-specific haplotypes, whereas controls from Africa exhibited nine unique haplotypes. These findings are consistent with the more recent emergence of European populations as compared to the ancestral populations originating in Africa.
In conclusion, our study has demonstrated significant worldwide differences in the frequency of disease-causing c.[833C; –] chromosomes, which are associated with three unrelated CBS haplotypes in populations of predominantly European origin. We propose that a complex evolutionary process was responsible for the formation of these haplotypes and that the common neutral c.[833C; 844_845ins68] chromosomes might have been a source of pathogenic c.[833C; –] chromosomes by mechanisms that probably involved gene conversion.
We thank Ms. E. Richterová for technical assistance and Dr. Petra Zavadakova and Ms. Lenka Mrázová for helpful discussions. V.K. was supported by the Wellcome Trust International Senior Research Fellowship in Biomedical Science in Central Europe 070255/Z/03/Z, and in part by research project MSM0021620806 of the Ministry of Education and Youth of the Czech Republic; J.P.K. was supported by NIH grants PO1HD0805 and HL65217, and a grant from the Jerome Lejeune Foundation.