Users may view, print, copy, download and text and data-mine the content in such documents, for the purposes of academic research, subject always to the full Conditions of use: http://www.nature.com/authors/editorial_policies/license.html#terms
QRS interval on the electrocardiogram reflects ventricular depolarization and conduction time, and is a risk factor for mortality, sudden death, and heart failure. We performed a genome-wide association meta-analysis in 40,407 European-descent individuals from 14 studies, with further genotyping in 7170 additional Europeans, and identified 22 loci associated with QRS duration (P < 5 × 10−8). These loci map in or near genes in pathways with established roles in ventricular conduction such as sodium channels, transcription factors, and calcium-handling proteins, but also point to novel biologic processes, such as kinase inhibitors and genes related to tumorigenesis. We demonstrate that SCN10A, a gene at our most significant locus, is expressed in the mouse ventricular conduction system, and treatment with a selective SCN10A blocker prolongs QRS duration. These findings extend our current knowledge of ventricular depolarization and conduction.
The electrocardiographic QRS interval reflects ventricular depolarization and its duration is a function of electrophysiological properties within the His-Purkinje system and the ventricular myocardium. A diseased ventricular conduction system can lead to life-threatening bradyarrhythmias, such as heart block, and tachyarrhythmias, such as ventricular fibrillation. Longer QRS duration is a predictor of mortality and sudden death in the general population and in cohorts with hypertension and coronary artery disease.1–3 In a population-based study, prolonged baseline QRS was associated with incident heart failure.4
Twin and family studies suggest a genetic contribution to QRS duration, with heritability estimates of up to 40%.5, 6 Prior candidate gene and smaller genome-wide studies identified a limited number of loci associated with QRS duration, supporting the hypothesis of the contribution of common genetic variation to QRS duration.7–9 To identify additional loci and highlight physiologic processes associated with ventricular conduction, we performed a meta-analysis of 14 genome-wide association studies (GWAS) of QRS duration in a total of 40,407 individuals of European descent, where we adjusted the analyses for age, sex, height, and body mass index after appropriate sample exclusions (Methods). After an initial discovery phase, we further genotyped selected variants representing nine loci with P-values ranging from 1 × 10−6 to 5 × 10−9 in an additional cohort of 7170 European individuals.
We conducted meta-analyses for approximately 2.5 million single nucleotide polymorphisms (SNPs) in 40,407 individuals of European ancestry from 14 GWAS (Supplementary Tables 1a and 1b). Overall, 612 variants in 20 loci exceeded our genome-wide significance P-value threshold of 5 × 10−8 after adjusting for modest genomic inflation (λGC = 1.059) (Figure 1 and Supplementary Figure 1). The loci associated with QRS interval duration are detailed in Table 1 and Supplementary Figure 2, with the index SNP (representing the most significant association) labeled for each independent signal.
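As an illustration of the genomic-control adjustment mentioned above, the following minimal sketch estimates λGC as the median of the observed 1-degree-of-freedom chi-square statistics divided by the expected median under the null (about 0.4549), then deflates each statistic by that factor. This is the generic genomic-control recipe and is assumed here; the exact procedure used in the study is given in the Methods. The function names and input values are hypothetical.

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364

def genomic_inflation(chi2_stats):
    """Estimate lambda_GC from a list of 1-df chi-square association statistics."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN

def apply_genomic_control(chi2_stats, lam):
    """Deflate statistics by lambda_GC; no correction is applied if lambda_GC <= 1."""
    factor = max(lam, 1.0)
    return [x / factor for x in chi2_stats]

# Hypothetical statistics for illustration only (not data from the study).
example_stats = [0.10, 0.30, 0.4818, 1.20, 3.50]
lam = genomic_inflation(example_stats)
corrected = apply_genomic_control(example_stats, lam)
```

With these made-up inputs the estimated inflation is close to the 1.059 reported above, and the corrected statistics have the null median by construction.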
Across the genome, the most significant association for QRS interval duration (locus 1) was on chromosome 3p22 (Figure 2a), where we identified six potentially independent association signals based on the linkage disequilibrium (LD) patterns in HapMap-CEU (pairwise r2 among index SNPs < 0.05). In conditional analyses where all six SNPs were included in the same regression model, there was compelling evidence that at least four SNPs from this region were independently associated with QRS duration (Table 1). Two of these associations were in or near SCN10A, a voltage-gated sodium channel gene. Variation at this locus was recently associated with QRS duration in two GWAS. The top SNP identified in those two studies, rs6795970, was in strong LD with our top signal, rs6801975 (r2=0.93).8, 9 Two additional signals were identified in SCN5A, a sodium channel gene adjacent to SCN10A (Table 1).
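The pairwise r2 values quoted above measure how strongly alleles at two SNPs are correlated. A minimal sketch of the textbook calculation from haplotype and allele frequencies follows; the function name and the frequencies are hypothetical and are not taken from HapMap-CEU.

```python
def ld_r2(p_ab, p_a, p_b):
    """r^2 between two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and allele B at SNP 2
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic")
    return d * d / denom

# Hypothetical frequencies for illustration only.
independent = ld_r2(p_ab=0.12, p_a=0.40, p_b=0.30)  # haplotype frequency equals p_a * p_b
perfect = ld_r2(p_ab=0.40, p_a=0.40, p_b=0.40)      # the two alleles always travel together
```

Under this definition, index SNPs with pairwise r2 < 0.05 are close to the "independent" case, whereas r2 = 0.93 between rs6795970 and rs6801975 is close to the "perfect" case.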
The second most significant locus (locus 2) was on chromosome 6p21 near CDKN1A, a cyclin dependent kinase inhibitor. The CDKN1A locus was recently associated with QRS interval duration in an Icelandic population.9 The index SNP in the prior report, rs1321311, was in strong LD with our top signal, rs9470361 (r2=0.88). Another cyclin dependent kinase inhibitor (CDKN2C) was located in locus 15, which encompasses several other genes including C1orf185, RNF11, and FAF1.
Locus 3 on chromosome 6q22 contains the PLN/SLC35F1/C6orf204/BRD7P3 cluster of genes. PLN encodes phospholamban, a key regulator of sarcoplasmic reticulum calcium reuptake. Significant associations were found in several other regions harboring calcium-handling genes, including locus 12 (STRN/HEATR5B), locus 16 (PRKCA), and locus 18 (CASQ2).
Locus 4 mapped to an intronic SNP in NFIA, a transcription factor. Several other significant loci also mapped in or near transcription factors including locus 5 (HAND1), locus 6 (TBX20), locus 8 (TBX5), locus 9 (TBX3), and locus 19 (KLF12). Common variation in TBX5 was recently associated with QRS duration.9 The index signal in the prior report, rs3825214, was in moderate LD with our top signal, rs883079 (r2=0.67).
Additional regions identified include locus 7 (SIPA1L1), locus 10 (VTI1A), locus 11 (SETBP1), locus 13 (TKT/CACNA1D/PRKCD), locus 14 (CRIM1), locus 17 (nearest gene, IGFBP3, is 660kb away), and locus 20 (LRIG1).
Collectively, the identified index SNPs across these 20 loci explained approximately 5.7% (±2.3%) of the observed variance in QRS duration, consistent with a polygenic model in which each of the discovered variants exerts only a modest effect on QRS interval. None of these index SNPs showed a significant interaction with sex or age after Bonferroni correction (Supplementary Table 2). We observed moderate levels of heterogeneity of the effect (25 < I2 < 75) for several index SNPs (Table 1). However, only HAND1/SAP30L showed significant evidence of heterogeneity using Cochran’s Q test corrected for 23 independent genome-wide variants (Cochran’s P = 0.005).
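For readers unfamiliar with the heterogeneity measures cited above, the sketch below computes an inverse-variance weighted pooled effect, Cochran's Q, and I2 from per-study effect estimates and standard errors. It uses the standard textbook formulas and assumes a fixed-effect model; the function names and the per-study values are hypothetical, and the P-value for Q (from a chi-square distribution with k − 1 degrees of freedom) is omitted.

```python
def fixed_effect_meta(betas, ses):
    """Inverse-variance weighted pooled estimate and its standard error."""
    weights = [1.0 / (se * se) for se in ses]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, betas)) / total
    return pooled, (1.0 / total) ** 0.5

def cochran_q_i2(betas, ses):
    """Cochran's Q and the I^2 statistic (in percent) across studies."""
    weights = [1.0 / (se * se) for se in ses]
    pooled, _ = fixed_effect_meta(betas, ses)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, i2

# Hypothetical per-study effects (ms per allele) and standard errors.
study_betas = [0.2, 0.6, 1.0]
study_ses = [0.2, 0.2, 0.2]
pooled_beta, pooled_se = fixed_effect_meta(study_betas, study_ses)
q_stat, i2_pct = cochran_q_i2(study_betas, study_ses)
```

In this toy example Q is 8 on 2 degrees of freedom, which gives I2 = 75%, the upper bound of the "moderate" range used above.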
Based on the discovery meta-analysis, we selected the index SNPs at four loci (loci 15, 17, 19, and 20) with P-values ranging between 5 × 10−8 and 5 × 10−9 and from all five loci with P-values ranging from 1 × 10−6 to 5 × 10−8 (Methods) for genotyping in an additional 7170 European individuals in order to boost power. In a joint analysis combining all 47,577 individuals, the significance for the four loci with P-values between 5 × 10−8 and 5 ×10−9 increased, indicating these represent true positive associations (Table 1). The joint analysis also provided further evidence for two other loci (locus 21 near DKK1, and locus 22 tagged by an intronic SNP in GOSR2) that reached genome-wide significance, bringing the total number of significant loci to 22 with 25 independently associated index SNPs (Table 1). The index SNP (rs1733724) in DKK1 was previously associated with QRS duration in an Icelandic population.9
Based on this series of QRS associations, we sought to test the hypothesis that QRS prolonging alleles, on average, increase risk of ventricular conduction defects. To address this question, we calculated a risk score in each individual by adding up the number of QRS prolonging alleles identified in this study, weighted by the observed effect sizes (β-estimates) from the final meta-analysis. In an independent set of 522 individuals from the ARIC and RS studies with bundle branch block or nonspecific prolongation of QRS interval (QRS>120 ms) compared with those with normal conduction (N = 12,804), each additional copy of a QRS prolonging allele was associated with an 8% increase in risk of ventricular conduction defect (P = 0.004). This result was largely driven by those with non-specific intraventricular conduction defects as opposed to those with left or right bundle branch block (Supplementary Tables 3a and 3b). Similar results were observed using an unweighted genotype risk score.
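The risk score described above can be sketched as follows: for each individual, the count of QRS prolonging alleles at each index SNP is multiplied by that SNP's β-estimate and the products are summed, and the unweighted variant simply counts alleles. The SNP identifiers, effect sizes, and function name below are hypothetical placeholders, not values from Table 1.

```python
def genotype_risk_score(allele_counts, betas=None):
    """Sum of QRS-prolonging allele counts, optionally weighted by effect size.

    allele_counts: dict mapping SNP id -> number of prolonging alleles (0, 1 or 2)
    betas: dict mapping SNP id -> effect size in ms per allele;
           None gives the unweighted score
    """
    if betas is None:
        return float(sum(allele_counts.values()))
    return sum(betas[snp] * count for snp, count in allele_counts.items())

# Hypothetical SNP identifiers and effect sizes, for illustration only.
effect_sizes = {"snp_a": 0.8, "snp_b": 0.5, "snp_c": 0.3}
person = {"snp_a": 2, "snp_b": 1, "snp_c": 0}
weighted = genotype_risk_score(person, effect_sizes)  # 0.8*2 + 0.5*1 + 0.3*0
unweighted = genotype_risk_score(person)              # 2 + 1 + 0
```

In an analysis of this kind the score would then enter a regression model as the predictor of conduction-defect status; that modelling step is not shown here.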
Of 612 genome-wide significant SNPs, one in SCN5A (rs1805124, H558R, P = 2.4×10−18), two in SCN10A (rs12632942, L1092P, P = 5.1×10−11, and rs6795970, A1073V, P = 5×10−27), one in C6orf204 near PLN (rs3734381, S137G, P = 1.1×10−10), and one in CASQ2 (index SNP rs4074536, T66A, P = 2.4×10−8) were nonsynonymous (Figure 2 and Supplementary Figure 2). The PolyPhen-2 program predicts all five of these variants to be benign, which is consistent with small-effect associations: each copy of the minor allele was associated with cross-sectional differences in QRS duration of less than 1 ms.
The 25 index SNPs (from Table 1) were subsequently tested for association with gene cis-expression levels in 1,240 PAXgene whole blood samples10. Four cis-eQTLs were detected after stringent Bonferroni correction (Supplementary Figure 3). The most striking eQTLs were observed for probes in exonic regions of TKT (rs4687718, P = 5.87×10−70) and CDKN1A (rs9470361, P = 1.41×10−10) and an intronic probe for C6orf204 near PLN (rs11153730, P = 1.54×10−10). We additionally assessed cis-regulation for all HapMap SNPs for these three loci (± 250kb around the SNPs). The top eSNP for TKT (rs9821134) and C6orf204 (rs11970286) were in moderate to high LD (r2 = 0.47 and 0.91, respectively) with the top QRS signals at these loci. However, the top eSNP for CDKN1A, rs735013, was only weakly correlated with the QRS index SNP rs9470361 (r2 = 0.089). In conditional analysis that included both CDKN1A locus SNPs in the regression model, both rs735013 and rs9470361 remained independently associated with expression levels (P = 1.7 × 10−9 and 2.3 × 10−5, respectively). Additionally, rs735013 itself was marginally associated with QRS duration (coded allele frequency = 0.39; β = 0.33 ms (±0.07); P = 2.4 × 10–6). Whether these associations in whole blood samples will be similar to associations in cardiac myocytes and conduction tissue deserves further investigation.
To explore the shared genetic underpinnings between atrial and ventricular depolarization and conduction (as measured by PR and QRS intervals) as well as ventricular depolarization and repolarization (QRS and QT intervals), we examined the effects of published PR and QT SNPs with respect to QRS interval. Several QRS loci were previously associated with PR or QT intervals, including PLN, TBX5/3, and SCN5A/10A, the last of which is associated with all three traits (Supplementary Table 4a). We also tested nine PR SNPs and 16 QT SNPs for their effect on QRS duration (Supplementary Table 4b).11–13 Our results suggest roles for CAV1/2 (rs3807989, P = 5.8 × 10−6) and NOS1AP (rs12143842, P = 1.3 × 10−4) in QRS duration. Indeed CAV1/2 was recently associated with QRS interval.9
QRS duration is positively correlated with both PR interval (r = 0.09) and QT interval (r = 0.44).9 To test if these relationships are also observed genetically, we compared the directionality of the association of SNPs at the published PR and QT loci with those for QRS duration. Generally, the effects of SNPs on PR interval were positively correlated with their effects on QRS duration (r = 0.53). With the exception of TBX3, the loci influencing both PR and QRS (SCN5A, SCN10A, TBX5, and CAV1/2), do so in a concordant fashion (i.e. variants that prolong PR also prolong QRS duration) (Figure 3 and Supplementary Tables 4a and 4b). By contrast, while QT and QRS are positively correlated at the population level, the effects of SNPs on QT interval were marginally negatively correlated with their effects on QRS (r = −0.08). Of the index SNPs at the four loci significantly associated with both QT and QRS (SCN5A/SCN10A, PRKCA, NOS1AP, PLN), only the PLN locus SNPs showed effects in the same direction (Figure 3 and Supplementary Tables 4a and 4b).
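The comparison of effect directions described above can be illustrated with a small sketch that computes the Pearson correlation between per-SNP effects on two traits and the fraction of SNPs whose effects share the same sign. The effect values below are hypothetical and the function names are illustrative; the calculation is the ordinary Pearson formula, not necessarily the exact procedure used in the study.

```python
def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of effect sizes."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def concordant_fraction(xs, ys):
    """Fraction of SNPs whose effects on the two traits have the same sign."""
    same = sum(1 for x, y in zip(xs, ys) if x * y > 0)
    return same / len(xs)

# Hypothetical per-allele effects (ms) of four SNPs on PR and on QRS.
pr_effects = [1.0, 2.0, 3.0, -1.0]
qrs_effects = [0.2, 0.4, 0.6, 0.3]
r_effects = pearson_r(pr_effects, qrs_effects)
concordance = concordant_fraction(pr_effects, qrs_effects)
```

A positive correlation with mostly concordant signs corresponds to the PR-QRS pattern reported above, whereas a correlation near zero with mostly discordant signs corresponds to the QT-QRS pattern.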
To examine the relationships between genetic loci associated with QRS duration, we developed an in silico relational network linking the loci based on published direct gene product interactions obtained from curated databases (Supplementary Figure 4).14 Most loci meeting genome-wide significance mapped to this network after a minimum number of “linker” nodes were incorporated to create a spanning network. This analysis provides a graphical overview of the interconnections among QRS-associated genetic loci and highlights both known and putative molecular mechanisms regulating ventricular conduction (see Discussion). Several of the “linker” nodes incorporated in the network, such as calmodulin, connexin 43 (GJA1), NEDD4, KCNMA1, and RYR2 are known modulators of cardiac electrical activity. Functional enrichment analysis of the QRS-associated network nodes (loci with P <5×10−8) using two independent software tools revealed that programs involved in heart development were highly over-represented (P-value range: 5.8×10−6 – 9.6×10−5).15, 16
We undertook functional studies to determine whether our most significant locus was associated with ventricular conduction in mice. Transcriptional profiling suggests that Scn10a/Nav1.8 mRNA is expressed in ventricular myocardium and at higher levels in the specialized conduction system.17 These data were confirmed and extended by qPCR (Figure 4a), demonstrating a 25.7 ± 1.1 fold enrichment of Scn10a/Nav1.8 in Purkinje cells compared to working ventricular myocytes (n=3 for each cell type; p=0.002).
Telemetric electrocardiographic recordings (lead II position) were obtained in conscious mice treated with A-803467, a potent Scn10a/Nav1.8 antagonist, which blocks Nav1.8 100 times more potently than Nav1.5 with the doses used.18 These studies demonstrated a significant increase in QRS duration (11.6 ± 2.6 ms to 14.5 ± 0.54 ms; n = 7; P < 0.001), whereas vehicle alone was without effect (11.4 ± 0.29 ms to 11.9 ± 0.42 ms; n = 7; P = NS). PR interval was also increased in drug-treated mice (from 31.4 ± 0.98 ms to 42.5 ± 3.3 ms; n = 7; P < 0.01), whereas vehicle alone resulted in no significant change (32.6 ± 1.0 ms to 33.4 ± 0.69 ms; n = 7; P = NS) (Figure 4b). To further delineate the site of ventricular conduction slowing, we performed intra-cardiac recordings from mice treated with A-803467. These studies confirmed the significant increase in QRS duration (from 12.26 ± 0.62 ms to 14.56 ± 0.58 ms; n = 7; P = 0.015), whereas vehicle alone was without significant effect (12.39 ± 0.52 ms to 13.65 ± 0.97 ms; n = 5; P = NS). A-803467 treatment resulted in a 35.7% ± 1.2% increase in HV interval (from 9.33 ± 0.74 ms to 12.67 ± 1.06 ms; P = 0.009), whereas vehicle alone was without significant effect (10.67 ± 0.83 ms to 11.17 ± 1.10 ms; P = NS) (Figure 4c). Taken together, these data indicate that the QRS prolongation may primarily reflect conduction slowing in the specialized ventricular conduction system.
Our meta-analysis of 14 genome-wide association studies consisting of 40,407 individuals of European descent, with additional genotyping in 7170 Europeans, yielded genome-wide significant associations of QRS duration with common variants in 22 loci. Variations in four of these loci (locus 1, SCN5A/10A; locus 2, CDKN1A; locus 8, TBX5; and locus 21, DKK1) were previously associated with QRS duration in smaller independent studies using both candidate gene and genome-wide approaches.7–9 The 22 loci include genes in a number of interconnected pathways, including some previously known to be involved in cardiac conduction, such as sodium channels, calcium-handling proteins, and transcription factors, as well as novel processes not known to be involved in cardiac electrophysiology, such as kinase inhibitors, growth factor-related genes, and others.
The electrocardiographic QRS interval reflects ventricular depolarization and conduction time. Ventricular myocyte depolarization occurs via cardiac membrane excitatory inward currents mediated by voltage-gated sodium channels.19 The primary determinants of conduction velocity are the magnitude of excitatory inward currents flowing through these sodium channels, the extent of cell-to-cell communication via gap junction/connexin coupling, and cell and tissue architecture and morphology.19 Multiple pathways suggested in this study determine or modulate these key components of ventricular depolarization and conduction. Candidate genes in these pathways are briefly discussed in Box 1.
Of the 22 loci identified, common variants in four loci (SCN5A/SCN10A, CDKN1A, TBX5, and DKK1) were previously associated with QRS duration in genetic association studies. Mutations in two (SCN5A and TBX5) lead to inherited syndromes associated with conduction disease. Animal experiments demonstrate a role for several additional loci (HAND1, TBX3, and TBX5) in cardiac ventricular conduction, as detailed below. The remainder are novel QRS loci, and their role in cardiac conduction remains to be elucidated.
Our strongest association signal (locus 1) mapped in or near two voltage-gated sodium channel genes: SCN5A and SCN10A. SCN5A encodes the cardiac Nav1.5 sodium channel and is well known for its role in cardiac conduction and in other cardiovascular and electrophysiologic phenotypes.20, 21 SCN10A encodes the Nav1.8 sodium channel. We provide novel data demonstrating that the SCN10A transcript and product are preferentially expressed in the mouse His-Purkinje system compared with the ventricular myocardium, and that Nav1.8 channel blockers result in QRS and HV interval prolongation, indicative of a slowing of impulse propagation in the specialized ventricular conduction system and delayed activation of the ventricular myocardium. Interestingly, Chambers et al. recently reported shortening of the PR interval in Scn10a knockout mice and concluded that Scn10a prolongs cardiac conduction and that rs6795970, encoding a Nav1.8 A1073V variant, is a gain-of-function allele.8 Alternatively, the more rapid conduction they observed in the knockout mice could reflect compensatory upregulation of TTX-sensitive currents, a phenomenon observed in Nav1.8-deficient DRG neurons.22
We, and others, demonstrated previously that, in addition to their association with QRS duration, variants in SCN5A and SCN10A are associated with atrial conduction (PR interval) and myocardial repolarization (QT interval), as well as atrial and ventricular fibrillation.8, 9, 13 These results emphasize the crucial role played by these genes in cardiac conduction and the generation of arrhythmias.
Calcium regulation is integral to impulse propagation, modulating cellular electrophysiology including sodium channel and gap junction function, as well as tissue architecture.20, 23, 24 Several of the loci associated with QRS duration contain genes directly related to calcium processes. As depicted in Supplementary Figure 4 and detailed in Box 1, these genes encode interrelated proteins that influence Ca2+ signaling (PLN in locus 3; PRKCA in locus 16; and CASQ2 in locus 18) and downstream effects (STRN in locus 12).
Transcription factors regulating embryonic electrophysiologic development are critical for the integrity of impulse conduction.25 We identified six transcription factors (TBX3 in locus 9; TBX5 in locus 8; TBX20 in locus 6; HAND1 in locus 5; NFIA in locus 4; and KLF12 in locus 19) in loci associated with QRS duration. Several of these transcription factors impact cardiac morphogenesis and may influence conduction by altering cellular and tissue architecture. Intriguingly, they may also have direct electrophysiologic consequences by modifying factors involved in impulse conduction. For example, HAND1 and T-box factors regulate connexin 40 (GJA5) and/or connexin 43 (GJA1), and TBX5 binds to the ATP2A2 (SERCA2A) promoter.26
Our study suggests a number of processes and pathways not previously known to be involved in cardiac electrophysiology, including cyclin dependent kinase inhibitors and genes related to tumorigenesis and cellular transformation. How these novel processes influence QRS duration remains to be defined.
In pleiotropic analyses, most variants influencing both PR and QRS, with the exception of TBX3, were concordant in effect direction, consistent with the known shared physiologic processes underlying the two traits: depolarization and conduction time in the sino-atrial, atria and atrioventricular node (PR) and depolarization and conduction time in the ventricles (QRS). By contrast, although QRS (ventricular depolarization) and QT (ventricular repolarization) are moderately positively correlated, most loci influencing both traits showed discordant effect directions (with the exception of the PLN locus). Investigating the physiologic foundations for these concordant and discordant PR-QRS and QT-QRS relationships could be particularly informative for elucidating the mechanisms by which these loci influence cardiac depolarization, conduction and repolarization.
Several limitations of our study should be considered. First, although we have identified 22 loci significantly associated with QRS duration, the broad nature of linkage disequilibrium among common variants generally precludes an unambiguous identification of the culprit variant or of the functional gene. For several genes (SCN5A, SCN10A, C6orf204, CASQ2), there are common coding SNPs in high LD with the index SNP, which may lend some support for a functional role for these genes. Furthermore, our expression analysis in blood revealed very strong cis-eQTL associations for TKT and CDKN1A, lending additional support to these genes as functional candidates. It would be desirable to perform similar eQTL analyses based on expression data in myocardial cells or conduction tissue. For our top signal in SCN10A, a gene which until recently was not known to be expressed in the heart, our functional work in mice confirms that SCN10A is involved in ventricular depolarization and conduction. Further fine-mapping is needed at all 22 loci to conclusively test all genetic variation (rare and common) for a role in QRS modulation.
To minimize the potential for confounding due to population substructure, we limited the analyses to individuals of European descent, for whom we could assemble the largest number of samples. At the individual study level, the GWAS showed very little evidence for gross stratification (genomic inflation factor, λGC, values ranged from 1.00 to 1.05). However, one of our QRS loci, mapping to HAND1/SAP30L, showed evidence of heterogeneity. In genetic association studies, heterogeneity can be due to sampling error, differences in phenotypic measurement, differences in LD structure between populations, technical artifacts, or genuine biological heterogeneity, but it would be difficult to conclude on the basis of our data here which is the most likely explanation.27
Our study underscores the power of a large genome-wide association study to extend prior biological understanding of cardiac ventricular conduction. Better understanding of the complex biologic pathways and molecular genetics associated with cardiac conduction and QRS duration may offer insight into the molecular basis underlying the pathogenesis of conduction abnormalities that can result in increased risk of sudden death, heart failure, and cardiac mortality.
Methods and any associated references are available in the online version of the paper.
Acknowledgements are available in the Supplementary Note.
N.S., A.A., D.E.A., P.I.W.d.B., E.B., H.C., A.C., C.M.v.D., M.E., S.B.F., G.I.F., A.R.F., J.F., V.G., P.v.d.H., S.R.H., A.A.H., A.H., A.I., S.K., H.K.K., C.N.-C., B.A.O., A.Pf., P.P.P., B.M.P., J.I.R., I.R., H.S., E.Z.S., B.H.C.S., A.G.U., A.V.S., U.V., H.V., T.J.W., J.F.W., A.F.W., N.J.S., Y.J.
A.A., D.E.A., L.H.v.d.B., R.A.d.B., E.B., M.J.C., A.C., J.M.C., A.F.D., M.D., C.M.v.D., R.S.N.F., A.R.F., L.F., S.G., H.J.M.G., T.B.H., P.v.d.H., C.H., G.v.H., A.I., W.H.L.K., N.K., J.A.K., A.K., L.L., M.L., F.-Y.L., I.M.L., G.t.M., P.B.M., G.N., C.N.-C., B.A.O., R.A.O., S.Pe., A.Pf., A.Pe., O.P., B.M.P., J.Q., F.R., J.I.R., I.R., N.J.S., C.S., M.P.S.S., M.F.S., E.Z.S., B.H.C.S., A.T., A.G.U., D.J.v.V., C.B.V., R.K.W., C.W., J.F.W., J.C.M.W., D.L., T.D.S.
A.A., D.E.A., T.A., P.I.W.d.B., N.S., E.B., A.C., L.A.C., M.E., K.E., G.I.F., A.R.F., L.F., J.F., C.F., S.A.G., W.H.v.G., S.G., V.G., P.v.d.H., C.H., S.R.H., A.I., T.J., W.H.L.K., X.L., K.D.M., I.M.L., M.M., I.M.N., S.Pa., A.Pf., O.P., B.M.P., K.R., H.S., A.T., A.V.S., S.H.W., A.Y.W., N.J.S.
N.S., A.A., D.E.A., F.W.A., P.I.W.d.B., M.D., C.M.v.D., M.E., G.I.F., J.F., S.A.G., V.G., C.H., A.I., Y.J., S.K., J.W.M., I.M.N., O.P., N.J.S., H.S., C.N.-C., P.v.d.H.
A.A., D.E.A., T.A., F.W.A., J.C.B., R.A.d.B., E.B., H.C., M.J.C., A.C., J.M.C., L.A.C., A.F.D., M.D., C.M.v.D., M.E., K.E., S.B.F., G.I.F., A.R.F., J.F., W.H.v.G., V.G., T.B.H., P.v.d.H., C.H., S.R.H., G.v.H., A.A.H., A.H., A.I., Y.J., T.J., S.K., W.H.L.K., N.K., J.A.K., A.K., H.K.K., L.L., D.L., M.L., J.W.M., I.M.L., T.M., M.M., P.B.M., G.N., C.N.-C., I.M.N., C.J.O., B.A.O., S.Pa., S.Pe., A.Pf., A.Pe., O.P., B.M.P., F.R., J.I.R., I.R., M.P.S.S., M.F.S., D.S.S., H.S., B.H.C.S., E.Z.S., A.T., A.G.U., D.J.v.V., U.V., H.V., T.J.W., H.-E.W., A.V.S., S.H.W., J.F.W., J.C.M.W., A.F.W.
L.H.v.d.B., E.B., H.C., M.J.C., A.C., J.M.C., A.F.D., C.M.v.D., S.B.F., G.I.F., W.H.v.G., H.J.M.G., V.G., P.v.d.H., A.H., Y.J., S.K., H.K.K., L.L., P.B.M., G.N., C.N.-C., C.J.O., B.A.O., R.A.O., P.P.P., B.M.P., J.I.R., I.R., N.J.S., N.S., T.D.S., A.G.U., D.J.v.V., U.V., H.V., T.J.W., R.K.W., H.-E.W., C.W., J.F.W., A.F.W., D.L.
MetABEL - http://mga.bionet.nsc.ru/~yurii/ABEL/
1000 Genomes project - www.1000Genomes.org
Ingenuity - http://www.ingenuity.com
DAVID - http://david.abcc.ncifcrf.gov
Note: Supplementary information is available online.
Competing interests statement: Aravinda Chakravarti is a paid member of the Scientific Advisory Board of Affymetrix, a role that is managed by the Committee on Conflict of Interest of the Johns Hopkins University School of Medicine.