Differences exist among various populations with regard to hypertension prevalence, severity, progression and response to therapy. Such differences may be due to genetic or environmental factors. We characterized the genetic variation and haplotype diversity of four hypertension candidate genes (CLCNKA, CLCNKB, BSND, NEDD4L) in four different ethnic groups (Caucasian Americans, African-Americans, Han Chinese, and Mexican-Americans).
We genotyped 42 single nucleotide polymorphisms across the four genes in equal numbers of each ethnically defined population, then tested for linkage disequilibrium, computed allelic and haplotype frequencies, and compared data across the different ethnic groups.
We identified significant genotype and allele frequency differences among ethnic groups. The strongest differences were observed between African-Americans and Mexican-Americans and between Caucasians and Mexican-Americans. In addition, haplotype blocks were defined for BSND, CLCNKA_B and NEDD4L in the four populations examined. Completely mismatched (‘yin yang’) haplotypes were also observed. We found that the number of inferred haplotypes varied from gene to gene and in some instances between populations for a given gene, indicating substantial haplotype diversity. The haplotype diversity among the various ethnic populations observed in our study was greater than that reported in the Perlegen database.
Haplotype diversity in hypertension candidate genes has important implications for designing and evaluating candidate gene or genome-wide blood pressure association studies that consider these genes.
The etiology of essential hypertension, which affects more than 60 million individuals in the United States, stems from a complex interaction of environmental and genetic factors. However, not all populations have the same risk for developing hypertension [2,3,4]. For example, as compared with Caucasians, African-Americans have a higher prevalence of hypertension, suffer a disproportionate morbidity from the disease, and more often exhibit salt-sensitivity. Identifying genetic factors that contribute to differences in hypertension risk will improve our understanding of this disease and may lead to improved clinical management.
Long-term control of blood pressure is mediated predominantly by maintenance of salt balance by the kidneys. This physiological control is enabled by renal tubular transport of Na and Cl. To date, most monogenic forms of abnormal blood pressure control involve genes that converge on this final common pathway, altering blood pressure by changing net renal salt reabsorption. Essential hypertension is genetically more complex but may also represent subtle defects in regulating renal salt balance.
In the kidney approximately 30% of NaCl reabsorption takes place in the thick ascending limb of the loop of Henle (TAL) and this depends upon a chloride channel complex encoded by CLCNKB and BSND which mediate cell Cl efflux. The importance of these genes is illustrated by loss-of-function mutations causing Bartter syndrome which is characterized by severe salt-wasting, hypotension and metabolic alkalosis. A related chloride channel gene (CLCNKA) may also contribute to this inherited salt wasting condition [5,6,7]. Evidence also suggests that a ubiquitin ligase protein encoded by NEDD4L is an important regulator of renal tubular salt transport and further interacts with the TAL chloride channel complexes. Genetic variations in these genes could predispose certain populations, especially salt-sensitive subjects, to develop hypertension. Indeed, CLCNKA, CLCNKB, BSND and NEDD4L have all been proposed as candidate genes for hypertension due to their role in renal physiology and pathophysiology, but none have been extensively examined for polymorphisms or subjected to haplotype analysis.
In this study we examined allele, genotype and haplotype frequencies of these genes at 42 SNPs in four distinct ethnic populations. Understanding the haplotype and linkage disequilibrium (LD) structure of these genes in various populations will assist in evaluating candidate gene or genome-wide association studies that examine CLCNKA, CLCNKB, BSND, and NEDD4L in the setting of abnormal blood pressure.
The study subjects were obtained from a large panel of anonymous, unrelated DNA samples from the Human Variation Collection of the NIGMS Repository held by the Coriell Institute. We specifically used sets of DNA samples obtained from four distinct ethnic groups residing in the United States: Caucasian Americans, African-Americans, Han Chinese of Los Angeles and Mexican-Americans. Using Ensembl (www.ensembl.org), we defined regions of each gene of interest within EnsMart (www.ensembl.org/EnsMart). We examined exons (coding regions, 5′UTR and 3′UTR), introns and 10 kb of flanking sequence for each gene. Since CLCNKA and CLCNKB are only 11 kb apart, they were considered as one gene (CLCNKA_B) in this study.
Once a region of interest was identified, we searched the database for SNPs with minor allele frequency (MAF) >5%, particularly searching for non-synonymous, coding SNPs. We did not find many such variants, either because of low frequency or, as in the case of BSND, because no coding region polymorphisms had been reported in the literature at the time we began our study. Using the Applied Biosystems (ABI) design pipeline, our final SNP selection for the study involved SNPs with MAF >5% and SNPs selected from the literature or public databases. Given the LD pattern in these genes, our analyses should provide adequate coverage to gain a good understanding of the patterns of variation in each of these genes. Table 1 provides detailed information about each SNP we studied. We genotyped 42 SNPs across 4 genes in equal numbers (n = 85) of each population using the SNPlex genotyping platform (Applied Biosystems). Genotyping products were analyzed using either an ABI 3730 DNA analyzer or an ABI Prism 7900 sequence detection system. Quality control measures to ensure genotyping accuracy included eight wells of ABI positive control, eight wells of no template control and sixteen wells of ABI allelic ladder.
Statistical tests for differences in single locus allele and genotype frequencies between ethnic groups and for deviations from Hardy-Weinberg equilibrium (HWE) were performed using Powermarker statistical software. Statistical significance for these analyses was determined using Fisher's exact test. In the case of significant deviations from HWE, the direction of the deviation was assessed using Selander's index, with negative values indicating a deficiency of heterozygotes and positive values an excess. Each SNP was assessed for allelic variation within populations as compared to between populations using Wright's fixation index (Fst) calculated using the algorithm of Weir and Cockerham. Fst analyses were performed using software based on the protocols of Raymond and Rousset 1995 (Tools for Population Genetic Analysis, TFPGA, version 1.3, available at http://bioweb.usu.edu/mpmbio/index.htm).
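The direction of an HWE deviation can be illustrated with a small sketch. It assumes the common uncorrected form of Selander's index, D = (Ho − He)/He, where Ho and He are the observed and expected heterozygote counts; the function name and the genotype counts are hypothetical, and Powermarker may apply a small-sample correction that is not reproduced here.

```python
def selander_index(n_hom_major, n_het, n_hom_minor):
    """Return (observed_het, expected_het, D) for one biallelic SNP.

    Assumes the uncorrected form D = (Ho - He) / He; this is a sketch,
    not the Powermarker implementation.
    """
    n = n_hom_major + n_het + n_hom_minor
    if n == 0:
        raise ValueError("no genotypes supplied")
    # Frequency of the major allele from genotype counts.
    p = (2 * n_hom_major + n_het) / (2 * n)
    q = 1.0 - p
    # Expected heterozygote count under Hardy-Weinberg equilibrium.
    expected_het = 2 * p * q * n
    if expected_het == 0:
        raise ValueError("SNP is monomorphic; index undefined")
    d = (n_het - expected_het) / expected_het
    return n_het, expected_het, d


# Hypothetical counts for n = 85 subjects (not taken from the study data).
obs, exp, d = selander_index(40, 25, 20)
print(f"observed het = {obs}, expected het = {exp:.1f}, D = {d:.2f}")
```

A negative D, as in this hypothetical example, corresponds to a deficiency of heterozygotes; a positive D corresponds to an excess.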
Pairwise linkage disequilibrium (LD) was characterized and haplotype frequencies were calculated for all ethnic groups using Powermarker and HaploView statistical software. Standard summary statistics D′ and r² were calculated using HaploView. Haplotype blocks were assigned using the D′ confidence interval algorithm. Both Powermarker and HaploView use an EM algorithm to determine haplotype frequency distributions when phase is unknown. The Powermarker haplotype trend regression (HTR) analysis was performed to test for haplotype frequency differences between ethnic groups. The test for association then uses an F test for a specialized additive model. This approach can be applied to both quantitative traits and dichotomous traits.
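For readers unfamiliar with the two LD summary statistics, the following minimal sketch computes D′ and r² for a pair of biallelic SNPs from haplotype and allele frequencies, using the standard textbook definitions. It assumes phased (or EM-estimated) frequencies are already available; the function name and the example frequencies are hypothetical, and this is not HaploView's code.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D_prime, r_squared) for two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and B at SNP 2
    p_a, p_b: frequencies of allele A at SNP 1 and allele B at SNP 2
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient
    # D is scaled by its maximum attainable magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = (d * d) / denom if denom > 0 else 0.0
    return d_prime, r_squared


# Hypothetical case with only three of the four possible haplotypes present:
# AB = 0.5, Ab = 0.3, ab = 0.2, so p_a = 0.8 and p_b = 0.5.
d_prime, r_squared = ld_stats(0.5, 0.8, 0.5)
print(f"D' = {d_prime:.2f}, r^2 = {r_squared:.2f}")
```

In this hypothetical case D′ is approximately 1 while r² is approximately 0.25, which shows why the two statistics are reported together: D′ reflects the absence of a fourth haplotype, whereas r² also depends on how well the allele frequencies match.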
Individual allele frequencies in the four ethnic populations are tabulated in table 2. The genotype differences mirrored the allele frequency differences. Statistically significant deviations from Hardy-Weinberg equilibrium (HWE) (table 2) were seen in BSND marker rs2864124 in Mexican Americans (p = 0.0002), CLCNKA_B marker rs2050522 in African Americans (p = 0.001) and Caucasians (p = 0.01), rs522155 in Caucasians (p = 0.05), rs158857 in Mexican Americans (p = 0.04), rs11152064 in African Americans (p = 0.02) and rs4940684 in Caucasians (p = 0.0007). Caucasians and Mexican Americans had very similar MAFs in CLCNKA_B and NEDD4L, with more variability in BSND. For rs2864124, rs2050522, rs4940684, and rs11152064 the deviations from HWE were due to a deficiency of heterozygotes, with Selander's index values of −0.39, −0.44, −0.36 and −0.19, respectively. For both rs522155 and rs158857 the Selander's index values were 0.23, indicating an excess of heterozygotes.
Allele frequency analysis revealed significant differences in frequency distributions among the four ethnic groups (table 3). Notably, African Americans and Caucasians had few statistically significant differences (p < 0.05) at either the allele or genotype level in BSND, with the exception of marker rs1003767. Other ethnic group comparisons, however, showed statistically significant differences at most markers in BSND. The greatest differences between ethnic groups were between Caucasians and Mexican Americans, and between African-Americans and Mexican Americans. Marker rs2864124 (BSND) also exhibited statistically significant allele and genotype frequency differences when the comparison included Mexican Americans, who had a much smaller MAF than the other ethnic groups (minor allele frequency 0.18).
Within CLCNKA_B, genotype analysis was similar for African-Americans and Caucasians, with 3/8 SNPs showing significant differences. As with the BSND genotypes, other ethnic group comparisons had statistically significant differences at 6–7/8 markers within CLCNKA_B, except for the comparison of Caucasians and Mexican Americans, where there were no differences at any of the markers.
Caucasians and Mexican Americans also did not have as many statistically significant genotype or allele frequency differences in the NEDD4L gene as the other comparisons, with only three of the markers demonstrating significant differences. A trend of note is that markers within two regions in NEDD4L were less frequently different between ethnic groups (region 1, chromosome position 53876954–53990160 bp: rs1942563, rs4461163, rs1977948, rs878396, rs6566942 and region 2, 54046768–54170475 bp: rs192659, rs158866, rs158857, rs11152064, rs4941378, rs17064701, rs11664416, rs8089678, rs9675944).
In order to assess the degree of similarity in genetic structure among the different populations, we calculated Fst for each SNP common across all populations (fig. 2). Fst values range from 0 to 1 and increase as the allele frequency difference between populations increases; a value >0.05 is typically considered a significant difference in allele frequencies between the populations. Five of 8 BSND SNPs had Fst values greater than 0.05, while 8/8 of the markers in CLCNKA_B were greater than 0.05, with rs2050522 having an extremely large peak at 0.5 because it was invariant in the Han Chinese. Fifteen of 21 SNPs in NEDD4L had substantial differences between ethnic groups, with several of the markers having Fst values >0.07.
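The behavior of Fst described above can be illustrated with a simplified calculation. The sketch below uses Wright's basic definition, Fst = (HT − HS)/HT, for one biallelic SNP with equally weighted populations; it is not the Weir and Cockerham estimator used in the study, which additionally corrects for sample size. The function name and the allele frequencies are hypothetical.

```python
def wright_fst(allele_freqs):
    """Simplified Wright's Fst for one biallelic SNP.

    allele_freqs: frequency of the same allele in each population.
    Populations are weighted equally (the study used equal sample sizes);
    no sample-size correction is applied, unlike Weir and Cockerham.
    """
    k = len(allele_freqs)
    if k < 2:
        raise ValueError("need at least two populations")
    p_bar = sum(allele_freqs) / k
    # Expected heterozygosity in the pooled (total) population.
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        return 0.0  # allele fixed or absent everywhere: no differentiation
    # Mean expected heterozygosity within populations.
    h_within = sum(2 * p * (1 - p) for p in allele_freqs) / k
    return (h_total - h_within) / h_total


# Hypothetical frequencies in four populations; the third is invariant.
fst = wright_fst([0.40, 0.45, 0.00, 0.18])
print(f"Fst = {fst:.3f}, exceeds 0.05 threshold: {fst > 0.05}")
```

Identical frequencies in all populations give Fst = 0, and a marker that is invariant in one population while common in the others is pushed toward larger values, consistent with the pattern reported for rs2050522.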
Haplotype blocks were defined in BSND, CLCNKA_B, and NEDD4L in the four populations (fig. 3, 4 and 5). BSND exhibited an LD block (1 kb) common to African Americans, Caucasians and Mexican Americans that included markers rs6682884, rs2864124 and rs943645 (fig. 3a, b and d). These markers were also in strong LD in the Han Chinese population, but did not define a block. Instead, another LD block (10 kb) was identified in the Han Chinese population that included rs103767, rs6682884 and rs2864124. Two of the markers (rs6682884 and rs2864124) had strong LD and were within haplotype blocks in all populations. CLCNKA_B (fig. 4) exhibited more variable LD, with the Caucasian, Han Chinese and Mexican populations showing strong LD. One block (24 kb), which included markers rs1739840, rs1763597, rs945425 and rs20039443, was common to these populations. Two of these markers (rs1739840 and rs1763597) defined a 1-kb block within African Americans, who exhibited weak linkage disequilibrium. NEDD4L (fig. 5) had several non-contiguous SNPs in LD with each other in several of the populations. One block was common to all of the populations examined (2 kb, including markers rs6566942 and rs4340401), another block was common to African Americans and Caucasians (2 kb, including markers rs1942563, rs4461163 and rs1977948), and another block was common to Caucasians and Mexicans (21 kb, including markers rs513563, rs522155 and rs192659). Several of the markers were invariant in the Han Chinese population (rs1942563, rs4461163, rs1977948 and rs17064701).
The haplotype frequencies in each group are reported for each gene (tables 4 and 5). BSND and CLCNKA_B haplotype analyses were run with all common markers, while NEDD4L haplotype analyses were run with markers with Fst values >0.07. Only NEDD4L markers with Fst values >0.07 were examined because Fst analyses suggest that these markers best differentiate these populations and because this approach reduces the number of possible haplotypes for this large gene. Haplotype blocks were identified in BSND (rs6682884, rs2864124 and rs28674124) with only ATC and CCT being present; in CLCNKA_B the haplotype for rs1739840 and rs1763597 revealed GA as having the larger frequency in African Americans, Caucasians and Han Chinese and AG as having the larger frequency in Mexicans. NEDD4L also had a block (rs6566942 and rs4340401) with a larger frequency of AC across populations. Analysis of haplotype frequencies between pairs of populations revealed statistically significant haplotype frequency differences for all comparisons (p < 0.001) except for NEDD4L in Caucasians vs. Mexicans, with a marginally significant difference for CLCNKA_B in Caucasians vs. Mexicans.
The ‘yin yang’ haplotypes are defined as two high-frequency haplotypes composed of completely mismatching SNP alleles, i.e., nucleotides differ at every SNP in the haplotype pair. We detected this haplotype phenomenon in all the genes studied (table 4). In the BSND gene, one pair of ‘yin yang’ haplotypes, ATC and CCT, was observed with overall frequencies of 57–78% and 14–38%, respectively, in the various groups. A pair of ‘yin yang’ haplotypes, GA and AG, was also observed in the CLCNKA_B gene with frequencies of 33–78% and 11–51%, respectively, in the various groups. Lastly, we observed a ‘yin yang’ haplotype pair in NEDD4L, AC and GT, with frequencies of 25–61% and 2–32%, respectively, among the four populations.
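The definition of a ‘yin yang’ pair can be expressed as a short check. The sketch below tests whether two haplotypes mismatch at every SNP and lists such pairs among common haplotypes. The function names, the 5% frequency cut-off and the example frequencies are assumptions for illustration; only the haplotype strings ATC, CCT, GA, AG, AC and GT come from the results above.

```python
from itertools import combinations


def is_yin_yang(hap_a, hap_b):
    """True if the two haplotypes differ at every SNP position."""
    if not hap_a or len(hap_a) != len(hap_b):
        raise ValueError("haplotypes must be non-empty and of equal length")
    return all(x != y for x, y in zip(hap_a, hap_b))


def find_yin_yang_pairs(hap_freqs, min_freq=0.05):
    """List completely mismatched pairs among haplotypes with frequency >= min_freq.

    hap_freqs: dict mapping haplotype string to its estimated frequency.
    min_freq: hypothetical cut-off for calling a haplotype "high-frequency".
    """
    common = sorted(h for h, f in hap_freqs.items() if f >= min_freq)
    return [(a, b) for a, b in combinations(common, 2) if is_yin_yang(a, b)]


# Haplotype strings from the text; the frequencies here are hypothetical.
print(is_yin_yang("ATC", "CCT"))
print(find_yin_yang_pairs({"ATC": 0.60, "CCT": 0.30, "ACC": 0.10}))
```

For a block of n biallelic SNPs up to 2**n haplotypes are possible, so a block dominated by a single mismatched pair carries far fewer haplotypes than the maximum.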
In this study we examined allele and genotype frequencies, LD and haplotypes among African, Caucasian, Chinese, and Mexican Americans in CLCNKA, CLCNKB, BSND and NEDD4L (fig. 1) using experimental and computational analyses. These populations vary with regard to incidence, prevalence and risk factors in the progression of hypertension. For example, African-Americans have a higher prevalence and suffer a disproportionate morbidity from the disease and more often exhibit salt sensitivity. Understanding genetic factors contributing to these differences among the various groups may lead to improved clinical management and better risk assessment. Our study provides the initial characterization of 4 genes that might have a role in essential hypertension. In general, the data demonstrate greater genetic variation in the population of African descent in comparison to the other groups, as has been demonstrated in comparable studies [16, 17]. However, this study observed a greater than expected ethnic allele frequency difference in comparison to the Perlegen database. Fst scores, haplotype analysis and LD blocks have not been previously reported for these genes.
The four genes we chose to examine in this study are involved in salt reabsorption in the kidney. CLCNKA and CLCNKB are located within 11 kb of each other on chromosome 1p36. They are >95% identical, presumably due to a historic gene duplication event; therefore, we characterized them together in this study. BSND, an accessory subunit to both CLCNKA and CLCNKB, is located on chromosome 1p31, while E3 ubiquitin-protein ligase NEDD4-like protein (NEDD4L), which interacts with ClC-Ka (CLCNKA) through barttin (BSND), is on 18q21. Although all four genes have been proposed as candidate genes for studying genetic variation in the hypertensive population, only two, CLCNKB and NEDD4L, have been shown to have an association with essential hypertension. CLCNKB-T481S, a reported gain-of-function variant, was initially associated with hypertension in Ghanaians and White Germans, but other studies have not been able to replicate this association [18,19,20]. Interestingly, despite multiple attempts we were unable to design primers to examine CLCNKB-T481S, and this SNP has not been genotyped by the HapMap project. Russo et al. conducted a haplotype study and demonstrated an association of haplotypes in NEDD4L and hypertension in three populations: White Americans, African and Greek Americans. To date there has not been any replication study of Russo et al. Aside from these two studies, extensive examinations of allele and genotype frequencies and haplotype analysis have not been conducted to characterize CLCNKA, CLCNKB, BSND and NEDD4L in distinct ethnic populations. Since the introduction of the HapMap, more groups, including ours, are examining haplotypes as part of association studies; therefore our study will provide an initial characterization necessary for examining CLCNKA, CLCNKB, BSND and NEDD4L as part of a candidate gene, genome-wide or haplotype association study in various populations.
In this study, we defined haplotype blocks and noted that the number and diversity of haplotypes varied greatly from gene to gene, making generalizations about haplotypes across these candidate genes impossible. All four populations had consistently strong LD with one haplotype block in BSND containing the same two SNPs (rs6682884 and rs2864124). These two SNPs clearly can be utilized as tag SNPs in candidate gene association studies in any of the four distinct ethnic groups we examined. This observation leads to the question: are tag SNPs population specific, or is power compromised in a haplotype association study when tag SNPs chosen from data in one population sample are examined in another sample? There are studies showing that tag SNPs chosen in one population (e.g. a European-descent population) are often not appropriate for genotyping in a different population (e.g. an African-descent population) [22, 23]. Yet others have shown that tag SNPs are transferable among multiple populations. Our study was not designed to answer whether tag SNPs are population specific, yet in these four selected genes there are unique haplotype block structures within each population that need to be considered when designing population association studies.
CLCNKA_B exhibited strong LD in all groups except for African-Americans. Similar to BSND, all four groups shared one haplotype block containing two SNPs (rs1739840 and rs1763597). African-Americans had less LD, reflecting evidence for historical recombination that has been noted in several other studies. We were unable to design primers and probes on the ABI SNPlex platform to examine the only gain-of-function variant, CLCNKB-T481S (rs12140311), associated with hypertension. However, this marker is located 283 bp from rs2050522, making rs2050522 a reasonable proxy for CLCNKB-T481S and surrounding markers. Marker rs2050522 is in strong LD with rs6604910 within both the African-American and Caucasian populations, suggesting that haplotype analyses using these markers and CLCNKB-T481S should be a focus of future studies.
Similar to the findings with CLCNKA_B, African-Americans have less LD in haplotype block analysis of NEDD4L. Caucasians, Chinese and Mexicans had longer haplotype blocks (4–6) for the same region. Russo et al. reported that one SNP (rs513563) in NEDD4L was associated with hypertension in both African and Caucasian Whites. In our study we found that this variant is present in high frequency in all four populations (MAF = 0.34–0.048). In addition, it exhibits strong LD in all four populations; therefore, this SNP should be utilized as a tag SNP in any haplotype analysis that examines blood pressure and NEDD4L not only in the African and Caucasian samples but also in the Han Chinese and Mexican populations.
Overall, African-Americans had weaker LD structure than the other groups, most notably within the NEDD4L gene. In NEDD4L (table 4) African-Americans have several haplotypes with less than 5% frequency that together account for 59% of the haplotypes, making it difficult to apply such frequency cut-offs in African-Americans. Common BSND and CLCNKA_B haplotypes were consistently less frequent in African-Americans, but not significantly different from the other groups. However, haplotype frequencies for NEDD4L demonstrated significant differences, with haplotypes of frequency >5% in African-Americans accounting for insignificantly lower proportions of all haplotypes than in the other groups (table 4). This could be a function of having more markers in NEDD4L, but more importantly it indicates that haplotype variation makes haplotype association studies more difficult, especially in African Americans. Lastly, we note that the Han Chinese had no variation at four markers, three in NEDD4L and one in CLCNKA_B.
In this study we observed ‘yin yang’ haplotype pairs for all four genes (table 5). There are several explanations for the yin-yang phenomenon observed in this study. First, other characteristics being equal, yin-yang reflects the differences in genes with regard to lower LD, mutation and recombination rates. This observation might be affected by the limited number of SNPs and genes examined, but it does reflect a substantially lower number of haplotypes than the number expected (2 vs. 4 or 2 vs. 8). Understanding the yin-yang phenomenon of a gene, especially when considering studies involving complex diseases, allows insight into the gene's LD, recombination and mutation rate.
Genes responsible for phenotypes that differ greatly between populations are expected to show large allele frequency differences between these populations and thus also demonstrate high Fst. In this study we calculated Fst for each SNP common across all populations. CLCNKA_B had 8/8 SNPs with Fst scores >0.05, while 5/8 markers in BSND and 15/21 in NEDD4L had Fst scores >0.05. These genes have been proposed to be associated with hypertension, and they have moderately high Fst, which may account for unusually large between-population differences in blood pressure and other intermediate phenotypes surrounding blood pressure. The extent of the differences was also seen in the comparisons of allele frequencies between populations, where all genes differed at more than 50% of the SNPs assayed except BSND between African Americans and Caucasians. The differences observed in this study were much greater than expected based on comparisons we have made using the entire Perlegen database (1,585,674 SNPs). In our previous analysis African Americans differed from Caucasians at 38.78% of the markers assayed, African Americans from Han Chinese at 45.01%, and Caucasians from Han Chinese at 34.08% (personal communication, unpublished data, Velez, DR et al). These data support the argument that the genes we have studied are more different between populations than the average across the genome. There were no data for Mexican-Americans.
Our study identified SNPs with high Fst in these genes. Such SNPs are candidates for as yet unidentified intermediate phenotypes affecting blood pressure that differ greatly between populations, and further exploration of high-Fst SNPs may expose novel genotype-phenotype relationships. High Fst values can suggest either local positive selection or drift driving changes in allele frequencies. Although one of the markers (rs2050522) in CLCNKA_B had an unusually high Fst (>0.5), it is not possible to definitively differentiate which factor is operating based on our data. In addition, SNPs with low (0–0.05) Fst scores were further examined for conservation among species using the Ensembl genome browser, a comprehensive suite of programs and databases for comparative analysis of genomic sequences (http://www.ensembl.org/index.html). Figure 5 shows SNPs in circles denoting low Fst scores and >50% conservation across species. Conservation was examined by sequence alignment, with a SNP considered conserved where >50% of bases in the alignments match across species. We observed that 36% of low-Fst SNPs are also >50% conserved. SNPs with low Fst scores and conservation across species suggest that selective constraint may be a factor.
In summary, our data provide an initial characterization of four proposed hypertension candidate genes. Allele frequency data and Fst values indicate the large variation that exists between the various ethnic populations. This study will serve as a starting point for examining these genes and blood pressure phenotypes in distinct ethnic groups.
This work was supported by the National Institutes of Health (DK071742, S.S.) and The Robert Wood Johnson Foundation (Amos Medical Faculty Development Program, S.S.).