The ‘rhythmonome’ is the term we have adopted to describe the set of genes that determine the normal coordinated electrical activity in the heart. Elements of this set include pore-forming ion channels, function-modifying proteins and intracellular calcium control elements. Rare mutations in many of these genes are known to cause unusual congenital monogenic arrhythmia syndromes, and single common variants have been reported to modify arrhythmia phenotypes. Here, we report an evaluation of the variation and haplotype structure in six key components of the rhythmonome.
SNPs were typed using DNA extracted from Coriell cell lines to survey allele frequencies and haplotype structure in six genes (ANK2, SCN5A, the KCNE1 and 2 gene cluster, KCNQ1, KCNH2 and RYR2) across four human populations (African American, European American, Han Chinese and Mexican American).
A total of 307 SNPs were analyzed across the six genes, revealing significant allele-frequency differences between populations and clear differences in haplotype structure.
The pattern of variation we report is an important step towards incorporating common variation across the rhythmonome in studies of arrhythmia susceptibility.
Normal electrophysiological activity of the heart is determined by ordered propagation of excitatory stimuli, resulting in rapid depolarization and slow repolarization and generating action potentials in individual cardiomyocytes. Physiological function of a large set of genes (to which we apply the term ‘rhythmonome’) and their products is required for such normal activity. Rare mutations in many components of the rhythmonome cause syndromes with a high potential for sudden death due to arrhythmia in the young; these include the long QT syndromes (LQTS) and catecholaminergic polymorphic ventricular tachycardia (CPVT) [2,3].
Ion currents flowing across cell membranes through ion channel proteins determine the shape and duration of cardiac action potentials. The most common disease genes involved in LQTS encode the potassium and sodium channels, KCNQ1, KCNH2 and SCN5A. Other LQTS disease genes encode ion channel ancillary subunits (KCNE1 and KCNE2, located in a single gene cluster, KCNE1 and 2) and ankyrin-B (ANK2), which modulates ion channel targeting to appropriate subcellular domains. The disease gene in autosomal dominant CPVT (RYR2) also encodes an ion channel: the intracellular ryanodine release channel, which is responsible for the increase in intracellular calcium that initiates contractile activity in the heart. Importantly, not only do mutations in these genes cause rare congenital arrhythmia syndromes, but polymorphisms, some of which are ethnicity specific, have also been implicated as a cause of variable susceptibility to common arrhythmias, including those triggered by drug administration [4-9]. However, the haplotype structure of these genes has not been determined and compared across ethnicities. Here, we report an analysis of common variation in these six genes in African American (AA), European American (EA), Han Chinese of Los Angeles (HC), and Mexican American (MA) populations.
Study subjects were obtained from a large panel of anonymous, unrelated DNA samples from the Human Variation Collection of the National Institute of General Medical Sciences (NIGMS) Repository held by the Coriell Institute (NJ, USA). We specifically used sets of DNA samples obtained from four distinct ethnic groups residing in the USA, including 88 EAs, 88 AAs, 88 HC and 89 MAs.
We selected six genes from the rhythmonome to explore patterns of normal genetic variation: ANK2, SCN5A, KCNE1 and 2, KCNQ1, KCNH2 and RYR2. Using the Ensembl web interface, we defined regions using each gene of interest within EnsMart. Regions we examined included exons (coding regions, 5′ and 3′-UTR), introns and 10-kb regions flanking both sides of the gene. Once a region of interest was identified, we then searched the database for SNPs with a minor allele frequency (MAF) of over 10%, searching for validated, nonsynonymous, coding SNPs in particular. In cases where there was a dearth of SNPs meeting these criteria, we lowered the MAF threshold, but never below 2%. In rare instances, we also accepted nonvalidated, nonsynonymous SNPs. We did this in order to obtain a distribution of one SNP every 300–500 bp in the region of interest.
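The threshold relaxation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the 2% step size, the use of the 500-bp end of the spacing target, and the SNP identifiers are all hypothetical, and the sketch uses a greater-than-or-equal comparison for simplicity where the text specifies a MAF of over 10%.

```python
def select_snps(candidates, region_bp, spacing_bp=500,
                start_maf=0.10, floor_maf=0.02, step=0.02):
    """Pick SNPs by MAF, relaxing the threshold when too few qualify.

    candidates: dict mapping a SNP identifier to its minor allele frequency.
    region_bp:  length of the region of interest in base pairs.
    The step size and spacing target are assumptions made for illustration.
    """
    needed = region_bp // spacing_bp          # SNPs required for the target density
    threshold = start_maf
    while True:
        chosen = [snp for snp, maf in candidates.items() if maf >= threshold]
        # Stop once the density target is met or the 2% floor is reached.
        if len(chosen) >= needed or threshold <= floor_maf:
            return chosen, threshold
        threshold = max(floor_maf, round(threshold - step, 10))


# Hypothetical candidates for a 1500-bp region (three SNPs needed).
example = {"snp_a": 0.30, "snp_b": 0.12, "snp_c": 0.05, "snp_d": 0.01}
chosen, used_threshold = select_snps(example, region_bp=1500)
print(chosen, used_threshold)
```

In this toy input, two SNPs pass the initial 10% threshold, so the threshold is lowered until a third SNP qualifies; the SNP with a MAF of 1% is never eligible because of the 2% floor.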
This list was verified to contain, or was supplemented with, common SNPs in the literature that are related to arrhythmia susceptibility. A final list of 753 SNPs was generated and entered into the SNPlex™ Design Pipeline (Applied Biosystems [CA, USA]). The SNPlex Design Pipeline segregates the selected SNPs into various predetermined design probe pools. These probe pool sets are then used to interrogate the samples on the SNPlex platform.
SNPs with a genotyping efficiency below 80% in one or more populations were dropped from all analyses. Hardy-Weinberg equilibrium was assessed for each SNP using Tools For Population Genetic Analyses (TFPGA) software, which is a program for the analysis of allozyme and molecular population genetic data, by conducting exact tests on all genotypes with a Monte Carlo approach (10 batches, 1000 permutations per batch). SNPs that significantly deviated from Hardy-Weinberg equilibrium (p < 0.01) in one or more populations were excluded from further analyses, because such deviation was taken as a sign of poor genotyping quality. A total of 307 SNPs out of 753 (40.77%) were used in the remaining analyses.
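The two quality-control filters can be illustrated with a minimal Python sketch. Note the assumption: TFPGA was run with a Monte Carlo approximation, whereas the sketch below enumerates every possible heterozygote count for a biallelic SNP and sums the exact probabilities, which is a different but related way to obtain an exact-test p-value. Function names and the example genotype counts are hypothetical.

```python
from math import exp, lgamma, log


def _log_factorial(n):
    """Return log(n!) via the log-gamma function."""
    return lgamma(n + 1)


def hwe_exact_p(n_aa, n_ab, n_bb):
    """Exact Hardy-Weinberg p-value for one biallelic SNP in one population."""
    n = n_aa + n_ab + n_bb            # genotyped individuals
    n_a = 2 * n_aa + n_ab             # copies of allele A
    n_b = 2 * n - n_a                 # copies of allele B

    def log_prob(het):
        # Probability of `het` heterozygotes given the observed allele counts.
        hom_a = (n_a - het) // 2
        hom_b = (n_b - het) // 2
        return (_log_factorial(n) - _log_factorial(hom_a)
                - _log_factorial(het) - _log_factorial(hom_b)
                + het * log(2)
                + _log_factorial(n_a) + _log_factorial(n_b)
                - _log_factorial(2 * n))

    # Heterozygote counts must share the parity of the allele count.
    probs = {het: exp(log_prob(het))
             for het in range(n_a % 2, min(n_a, n_b) + 1, 2)}
    observed = probs[n_ab]
    # Sum the probabilities of all outcomes no more likely than the observed one.
    return min(1.0, sum(p for p in probs.values() if p <= observed * (1 + 1e-9)))


def keep_snp(call_rates, genotype_counts, min_call_rate=0.80, alpha=0.01):
    """Apply both filters; a failure in any single population drops the SNP."""
    if any(rate < min_call_rate for rate in call_rates.values()):
        return False
    return all(hwe_exact_p(*counts) >= alpha
               for counts in genotype_counts.values())


# Hypothetical SNP typed in two populations of 88 individuals each.
counts = {"EA": (22, 44, 22), "AA": (44, 0, 44)}
print(keep_snp({"EA": 0.95, "AA": 0.93}, counts))
```

The second population in the example has no heterozygotes at all, so the SNP is dropped even though its genotyping efficiency is acceptable in both populations.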
Pairwise linkage disequilibrium (LD) was estimated for the SNPs in each gene using the standardized summary statistics D’ and r², which were calculated using HaploView software. Haplotype blocks were assigned using the D’ confidence interval algorithm from Gabriel et al., which is implemented in HaploView. Haplotype frequencies for each block were estimated using two methods: the expectation-maximization (EM) algorithm using HaploView software, and the Bayesian haplotype reconstruction program PHASE version 2.1.1 [15,16]. Since KCNE1 and KCNE2 are only 74.5 kb apart, they were considered as one gene cluster (KCNE1 and 2) in this study.
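For readers unfamiliar with the two LD statistics, the following Python sketch computes D’ and r² for a pair of biallelic SNPs. It assumes that phased haplotype counts are already available; HaploView instead estimates haplotype frequencies from unphased genotypes, so this is a simplified illustration and not a description of that software. The function name and the counts are hypothetical.

```python
def ld_stats(n_11, n_12, n_21, n_22):
    """Return (D', r^2) from counts of the four two-SNP haplotypes.

    n_11 counts haplotypes carrying allele 1 at both SNPs, n_12 allele 1 at
    the first SNP and allele 2 at the second, and so on. Both SNPs must be
    polymorphic, otherwise the statistics are undefined.
    """
    total = n_11 + n_12 + n_21 + n_22
    p_a = (n_11 + n_12) / total       # frequency of allele 1 at the first SNP
    p_b = (n_11 + n_21) / total       # frequency of allele 1 at the second SNP
    d = n_11 / total - p_a * p_b      # raw disequilibrium coefficient D

    # D' rescales |D| by the largest value it could take for these frequencies.
    if d > 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max

    # r^2 is the squared correlation between alleles at the two SNPs.
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r2


# Hypothetical counts in which one of the four haplotypes is never observed.
print(ld_stats(60, 0, 20, 20))
```

With these counts D’ equals 1 while r² is only 0.375, which shows why a block defined by D’ can still need more than one TagSNP under an r² threshold.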
Exact tests for allele-frequency differences in SNPs were conducted in the TFPGA software, which employs an algorithm by Raymond and Rousset. For each SNP, we calculated Wright’s fixation index (Fst), a measure of the proportion of allelic variation that is attributable to differences between populations rather than to variation within them. Fst values were calculated using the algorithm of Weir and Cockerham [18,19].
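The idea behind Fst can be shown with a short Python sketch. It uses the simple heterozygosity-based form, Fst = (HT - HS) / HT, with equal weight for every population. This is an assumption made for clarity and is not the Weir and Cockerham estimator used in the study, which additionally corrects for sample size. The function name and allele frequencies are hypothetical.

```python
def fst_simple(allele_freqs):
    """Heterozygosity-based Fst for one biallelic SNP.

    allele_freqs: frequency of the same allele in each population.
    Populations are weighted equally (a simplifying assumption).
    """
    k = len(allele_freqs)
    p_bar = sum(allele_freqs) / k
    h_total = 2 * p_bar * (1 - p_bar)                       # expected heterozygosity, pooled
    h_within = sum(2 * p * (1 - p) for p in allele_freqs) / k  # mean within populations
    if h_total == 0:
        return 0.0                                          # monomorphic everywhere
    return (h_total - h_within) / h_total


# Hypothetical SNP: three populations share a frequency, one diverges.
print(round(fst_simple([0.3, 0.3, 0.3, 0.6]), 3))
```

In this example a single divergent population is enough to push the statistic above 0.05.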
TagSNPs were identified for each gene within each ethnicity using the Tagger feature of HaploView with an r² threshold of 0.8. Genotyping platforms were evaluated using probe lists obtained from vendors’ websites. Individual SNPs were matched to probe lists using reference sequence (RS) IDs, and only probes with vendor-specified RS IDs were used for matching.
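A minimal sketch of pairwise TagSNP selection is given below in Python. It is a plain greedy procedure written for illustration, under the assumption that pairwise r² values are already available; it should not be read as the exact algorithm implemented in Tagger. The function name, SNP identifiers and r² values are hypothetical.

```python
def greedy_tags(snps, r2, threshold=0.8):
    """Choose TagSNPs so that every SNP is a tag or has r^2 >= threshold with one.

    snps: ordered list of SNP identifiers.
    r2:   dict mapping an unordered pair (a, b) to the pairwise r^2;
          pairs that are missing are treated as r^2 = 0.
    """
    def captures(tag, other):
        if tag == other:
            return True
        return r2.get((tag, other), r2.get((other, tag), 0.0)) >= threshold

    uncaptured = set(snps)
    tags = []
    while uncaptured:
        # Take the SNP that captures the most still-uncaptured SNPs.
        best = max(snps, key=lambda s: sum(captures(s, u) for u in uncaptured))
        tags.append(best)
        uncaptured = {u for u in uncaptured if not captures(best, u)}
    return tags


# Hypothetical region of four SNPs with three known pairwise r^2 values.
pairwise = {("s1", "s2"): 0.95, ("s2", "s3"): 0.85, ("s3", "s4"): 0.30}
print(greedy_tags(["s1", "s2", "s3", "s4"], pairwise))
```

Here two tags capture four SNPs: the second SNP stands in for its two neighbours, while the fourth is not in strong LD with any other SNP and must be typed directly.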
To determine the presence of copy-number variants (CNVs) in the six genes of interest, we obtained the Affymetrix (CA, USA) SNP 6.0 GeneChip® genotyping data from the Coriell Human Genetic Cell Repository. The data consisted of 100 samples from each of the four different ethnic groups (EA, AA, MA and HC). The analysis was conducted using the Affymetrix Genotyping Console platform [20,21]. The analysis was also conducted using Partek® (MO, USA) software version 6.4.
The general haplotype structure of each gene is shown in FIGURE 1 by ethnicity. Detailed results of all statistical analyses for each gene and ethnicity are shown in the context of the haplotype structure in LD-plus [BUSH W, DUDEK S, RITCHIE M ET AL. SUBMITTED] composite figures (FIGURES 2 & 3 for KCNH2 and for other genes, see ). A total of five haplotype blocks were common to all ethnicities, and their haplotype frequencies are shown in FIGURE 4. In addition, there were eight blocks common to three of the four populations, and 29 blocks were common to only two of the four populations.
Allele-frequency analysis revealed significant differences in frequency distributions among the four different ethnic populations. In order to assess the degree of similarity in genetic structure among the different populations, we calculated Fst for all SNPs. Fst values range from 0 to 1 and increase as the allele frequency difference between the populations increases; a value of 0.05 or more is typically considered a significant difference in allele frequencies between the populations.
The haplotype structure of ANK2 is most clear in the HC population, with well defined haplotype blocks (FIGURE 1). EA and MA blocks maintain a similar structure with less well-defined edges. In the AA population, the large block from rs2285711—rs29311, which is characteristic of other ethnicities, is much more diffuse. There are two haplotype blocks common to all ethnicities. The first block in ANK2 includes two markers, rs1026089 and rs13108692 (7290 bp), and has two common haplotypes with approximately equal frequency in four populations (FIGURE 4). The second block in ANK2 includes two markers, rs13101312 and rs1351997 (9500 bp), and has two common haplotypes. For this block, the MA (0.70) and AA (0.73) populations have a different haplotype frequency distribution to the HC (0.59) or EA (0.56) populations (FIGURE 4).
African Americans demonstrated the most statistically significant differences in allele frequency compared with the other ethnicities. There are two notable stretches of allele-frequency difference that constitute haplotypes more prominent in the AA population (rs961294–rs2135351 and rs17045935–rs2272230). Allele frequencies for six SNPs (rs13108692, rs1026089, rs689117, rs6533668, rs13131206 and rs7697312) in ANK2 were remarkably similar across the four populations (exact test: p > 0.05 for all comparisons).
Across all populations, Fst values for 45 of 81 ANK2 SNPs were greater than 0.05. Notable peaks in Fst values (> 0.15) occurred at rs2121928, rs6533660, rs313962, rs29307 and rs17045935. Population differences at these SNPs are mostly due to divergent allele or haplotype frequencies in AA populations.
The haplotype structure of the KCNE cluster is most distinctive in the AA samples, with fewer and smaller blocks than the other ethnicities. The EA samples have a more well-defined block structure from rs1012945 forward, whereas this region has more diffuse LD in the HC, AA and MA ethnicities. There are two notable stretches of unique allele frequencies in AAs, from rs2834465—rs2834467 and from rs2834477—rs2834480.
For KCNE1 and 2, the fewest statistically significant differences are between AAs and EAs with only 22 of the 60 markers differing in frequency. MA and HC allele frequencies are also similar, with only 26 of the 60 markers showing significant difference.
A total of 25 of 60 markers in the KCNE1 and 2 gene cluster had Fst values greater than 0.05, making this gene cluster the least diverse of all the six surveyed. One SNP, rs2834502, has an Fst of more than 0.15, owing to allele frequency divergence in the EA population. This SNP also occurs frequently in an EA haplotype.
With regard to KCNH2, EAs have a very discrete haplotype structure, and all SNPs had significantly different allele frequencies compared with the HC population. Three markers (rs1805120, rs740952 and rs2269001) remain linked in the AA population and form a frequent CGG haplotype that is part of an infrequent haplotype for a larger block in the HC population (rs2968855—rs2269001, CGCGG) (FIGURE 3). Conversely, the relatively rare TAA haplotype in AAs is part of the dominant CGTAA haplotype of the larger HC block (FIGURE 4). There is also a notable frequent allele sequence in the clear EA block (rs2373885—rs10277237, ATCG) that is less common in the MA and HC populations. The MA and HC allele frequencies for nearly all markers in KCNH2 are significantly different from each other and from the other two populations, so while the haplotype block structure between EA, HC and MA is similar, the haplotype frequencies are quite different.
Nearly all the SNPs in KCNH2 (16 of the 19) had Fst values greater than 0.05, making this an extremely diverse gene. Fst values plateau above 0.15 through the single AA haplotype block, driven by a frequent allele sequence of CGG in AA and EA populations that is of lower frequency in HC and MAs, where the TAA sequence is more prominent.
In KCNQ1, the EA and HC populations have very similar haplotype structures as well as allele and haplotype frequencies from rs800338–rs1459825. The AA population has a stretch of significantly different allele frequencies from the other populations between rs6578273 and rs2299618. The general haplotype structure is fairly similar across all ethnicities, with a slightly more developed block from rs2283205 to rs231879 in the EA population (FIGURE 5).
There was one common haplotype block in the KCNQ1 gene involving two markers, rs739677 and rs800338 (524 bp). Three frequent haplotypes were observed for these two markers, and each population had a unique haplotype distribution (FIGURE 4).
A total of 37 of 53 SNPs in KCNQ1 had Fst values greater than 0.05, highlighting the diversity of this gene. Notable Fst peaks (>0.15) occurred at rs2283172, rs163183, rs2237896 and rs231356. The high Fst for rs2283172 is owing to strong allele-frequency differences in the AA population. SNPs rs163183 and rs2237896 are part of haplotype blocks that have very different haplotype frequencies between AA, EA and HC populations. The Fst for rs231356 is driven mostly by divergence in the EA population.
In SCN5A, the EA, HC and MA populations all have very similar haplotype structures, but the HC population has distinct allele frequencies (FIGURE 1). The EA and MA populations show very little significant difference in allele frequency, with only five markers differing. Several SNPs (rs7374138, rs7372251, rs6599223 and rs7427574) demonstrate highly significant allele-frequency differences from the other populations.
There is one two-marker haplotype block common to all populations, involving rs6599224 and rs6599223 (2490 bp). The HC population appears to have a different haplotype frequency (0.64) to the other three populations for this haplotype block (0.83–0.92) (FIGURE 4). We observed a completely mismatched pattern in the CG and TC haplotypes of SCN5A, with major (CG) haplotype frequencies of 64–92% and minor (TC) frequencies of 8–36% among the ethnicities.
Only 17 of the 39 SNPs in SCN5A have Fst values greater than 0.05, which reflects the similarity between the EA and MA populations and highlights the differences in allele frequency in the HC and AA populations. SNP rs7372251 peaks in Fst value owing to highly significant allele-frequency differences in the HC population compared with all others and in the AA population compared with all other populations.
The general haplotype structure of RYR2 is quite similar across the four populations, with the exception of a stretch of SNPs (rs2805422—rs12116442) that form a diffuse block of LD in the EA and MA populations, and a subset of these (rs1382581—rs12115442) that form a well-defined haplotype block in the HC population (FIGURE 5). These structures are almost completely absent in the AA population.
The RYR2 gene exhibited one three-marker haplotype block involving rs7526245, rs7526759 and rs7537963 (606 bp). There are two frequent haplotypes within this haplotype block, and all haplotypes have different frequencies in each population (FIGURE 4). The ‘yin-yang’ haplotypes are defined as two high-frequency haplotypes composed of completely mismatching SNP alleles; that is, nucleotides differ at every SNP in the haplotype pair. A pair of ‘yin-yang’ haplotypes, GTG and ACT, were observed in the first haplotype block (rs7537963–rs7526245) with frequencies between 37–81% and 19–63%, respectively, among the four populations (FIGURE 4).
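The ‘yin-yang’ definition translates directly into a position-by-position comparison. The short Python sketch below checks only the complete-mismatch criterion; the requirement that both haplotypes be of high frequency is not modelled, and the function name is hypothetical.

```python
def is_yin_yang(hap_1, hap_2):
    """Return True when two haplotypes differ at every SNP position."""
    if not hap_1 or len(hap_1) != len(hap_2):
        return False                      # haplotypes must span the same SNPs
    return all(a != b for a, b in zip(hap_1, hap_2))


print(is_yin_yang("GTG", "ACT"))   # the RYR2 pair reported above
print(is_yin_yang("CG", "TC"))     # the SCN5A pair reported above
print(is_yin_yang("GTG", "GCT"))   # shares the first allele, so not yin-yang
```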
For RYR2, there were a large number of statistically significant differences in allele frequency in comparisons involving AAs compared with other groups. A couple of stretches of markers demonstrated significant differences (rs2805422—rs2618661 and rs2819757—rs10925501) between AAs and all other populations. EAs and MAs were the most similar in allele frequency distribution, with only 13 of the 55 markers being significantly different.
Measured by Fst, RYR2 showed substantial differences among ethnic groups: 40 of the 55 SNPs in RYR2 had Fst values greater than 0.05, and nearly half (26 of the 55) had Fst values greater than 0.10. One SNP, rs12410114, had an Fst of nearly 0.30; it resides at the tail of a diffuse block in the EA and MA ethnicities and of a well-defined HC block. This Fst effect is driven largely by dramatic differences in haplotype frequencies, with the TA haplotype being dominant in HC and the AG haplotype being very frequent in the EA and MA populations.
Tagger analysis revealed that the HC population required the fewest number of TagSNPs (194) to capture all 307 variants (TABLE 1). The AA population, being the most genetically ancestral population, required the most TagSNPs (263). The large haplotype blocks found in ANK2, KCNE1 and 2 and KCNH2 can be exploited by using TagSNPs to dramatically reduce the amount of genotyping required, in some cases reducing the number of genotyped SNPs by half.
Based on the Affymetrix Genotyping Console platform, we identified one Coriell sample with a CNV in ANK2. We also observed four Coriell samples with a CNV in RYR2. One of these CNVs was replicated using the Partek software — a 4241-bp deletion in RYR2 from 235,341,289 to 235,345,530 (p = 0.0001).
To estimate the amount of information capture obtained in genome-wide association studies, we assessed how many TagSNPs for these genes are represented on the commonly used genotyping platforms, including Affymetrix Mapping 100K (AFFY100), Affymetrix Mapping 500K (AFFY500), Affymetrix Mapping 6.0 (AFFY1M), Illumina (CA, USA) HumanHap 300 (ILMN300), Illumina HumanHap 550 (ILMN550), Illumina HumanHap 650 (ILMN650) and Illumina HumanHap 1M (ILMN1M) (TABLES 2 & 3). As expected, more dense SNP arrays capture more of the 307 identified variants, both directly and indirectly using TagSNPs.
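The distinction between direct and indirect capture can be sketched as follows in Python. The sketch assumes that platform content is a set of RS IDs and that pairwise r² values among the study SNPs are known; the 0.8 threshold mirrors the Tagger setting, while the function name, identifiers and values are hypothetical and do not reproduce any vendor's probe list.

```python
def platform_coverage(study_snps, platform_ids, r2, threshold=0.8):
    """Split study SNPs into directly typed, indirectly tagged and missed sets."""
    study = set(study_snps)
    direct = study & set(platform_ids)        # matched by RS ID on the array

    def linked(a, b):
        return r2.get((a, b), r2.get((b, a), 0.0)) >= threshold

    # A SNP is captured indirectly when a directly typed SNP tags it.
    indirect = {s for s in study - direct
                if any(linked(s, d) for d in direct)}
    missed = study - direct - indirect
    return direct, indirect, missed


# Hypothetical study SNPs and a hypothetical array carrying one of them.
pairwise = {("s1", "s2"): 0.95, ("s2", "s3"): 0.85, ("s3", "s4"): 0.30}
direct, indirect, missed = platform_coverage(
    ["s1", "s2", "s3", "s4"], {"s2", "s9"}, pairwise)
print(sorted(direct), sorted(indirect), sorted(missed))
```

In the example one typed SNP captures two further SNPs indirectly, and one SNP is missed because no array SNP is in strong LD with it; a gene with sparse LD would leave more SNPs in the missed set.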
The rhythmonome is the term we have adopted to describe the set of genes that determine the coordinated, normal electrical activity in the heart. The discovery of rare mutations that induce unusual arrhythmia syndromes and common variants that modify more frequent arrhythmia phenotypes invited a more thorough characterization of ethnic differences in allele frequency, LD structure and haplotype frequencies across six key genes of the rhythmonome (ANK2, SCN5A, the KCNE1 and 2 gene cluster, KCNQ1, KCNH2 and RYR2). A complete characterization of all haplotypes would require resequencing each gene region in many individuals [23,24], but the results of this study provide a basic, working understanding of population differences in these six genes that is necessary for conducting association studies. We also provide an assessment of the representation of the selected SNPs on various large-scale genotyping platforms. These large-scale genotyping platforms are extremely popular for genome-wide association studies; however, as demonstrated in this manuscript, not all genes are covered equally, and supplemental genotyping may therefore be advantageous. In addition, most of these platforms are tailored only to populations of European descent (with some additional SNPs for populations of African descent). Because of the known ethnic variation in polymorphisms across the genome, there is clearly a need to be careful in the selection of SNPs for association studies at the genome-wide or candidate-gene level. A variant of the SCN5A sodium channel (Y1102, rs7626962, located between rs7428882 and rs7431011) has previously been demonstrated to influence the risk of cardiac arrhythmia. This variant is common among individuals of African descent, and the AA population surveyed in this study shows a large degree of allele-frequency differences from the other populations near the location of this variant.
There is a relatively high Fst value in the genetic region surrounding the variant, and punctuated stretches of allele-frequency difference suggest that the AA population, on average, harbors significantly different alleles in this gene to the other populations sampled. In addition, the scant patterns of LD in this area make using TagSNPs in this region problematic, and suggest that common genome-wide association study platforms may not sample the variation of this gene well for AAs.
Studies have also reported rare polymorphisms in the KCNE1 and 2 gene cluster that are associated with increased torsades risk (D85N, rs1805128 — between rs2070357 and rs2070358) [25,26]. Should there also be common variants in this functionally critical region of these potassium channel genes, they are likely to lie on a haplotype background in both AA and MA populations. In EA and HC populations, this region lies on the edge of well-defined haplotype blocks. The region also harbors low genetic diversity, with few differences in allele frequency among the populations surveyed and low Fst values; therefore, it is possible that common variants with similar functional consequences to the rare polymorphisms of this gene would influence risk in multiple populations.
Ion channels are the primary regulators of normal cardiac rhythm, and abundant evidence is now emerging that common and rare variation in ion-channel genes modulates the risk of common arrhythmias, such as atrial fibrillation or sudden cardiac death [27-30]. Susceptibility to arrhythmias also appears to include an ancestry-dependent component; for example, while AA subjects have a high incidence of traditional risk factors for atrial fibrillation, the incidence of the arrhythmia is actually lower than in other ethnic groups. The effect of drugs to treat or prevent arrhythmias is highly variable, and reasons for this include a genetic component. Thus, the present data point to future work in unraveling both arrhythmia susceptibility and variability in response to drug therapy.
Genes of the rhythmonome
Genetic diversity of the rhythmonome
This work was supported by National Institutes of Health grant HL65962. The authors would like to acknowledge Robert Woodhall and Nila Gillani for their contributions to the genotyping.
Financial & competing interests disclosure
The authors have no relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript. This includes employment, consultancies, honoraria, stock ownership or options, expert testimony, grants or patents received or pending, or royalties.
No writing assistance was utilized in the production of this manuscript.
Ethical conduct of research: The authors state that they have obtained appropriate institutional review board approval or have followed the principles outlined in the Declaration of Helsinki for all human or animal experimental investigations. In addition, for investigations involving human subjects, informed consent has been obtained from the participants involved.