To our knowledge, this is among the first GWAS of an infectious disease and the first GWAS of KD. Using a staged study design and subsequent fine-mapping, we have identified a number of novel variants that are associated with KD susceptibility. These include variants within or close to genes that are functionally inter-related and that are plausible biological candidates in KD pathogenesis. The magnitude of the effect sizes for KD susceptibility is comparable to that reported from other GWAS [17]. Fine-mapping of associated and replicated SNPs focused on more frequent variants that lie in known genes. In eight of these sixteen genes, fine-mapping confirmed the association and identified one or more associated haplotype(s), which will form the basis of resequencing to identify the disease-modifying variants.
The assertion that these variants are in (or close to) biologically relevant loci involved in KD susceptibility is supported by: (i) identification of eight loci containing one or more independently associated haplotypes identified by GWAS, replicated in an independent family-based study and subsequently fine-mapped; (ii) significantly different mRNA transcript abundance for five of the seven blood-expressed fine-mapped genes during acute versus convalescent KD; and (iii) gene network analyses suggesting that biologically plausible functional relationships, which are extremely unlikely to have occurred by chance, exist between five of the associated loci.
We focused on fine-mapping of associated SNPs that lie either in or within 5 kb of known genes and had a MAF of >0.05 in HapMap. These data represent the most robust associations, and we will therefore focus our discussion on those genes for which putative functional relationships were suggested by IPA. We used IPA in an unsupervised manner, allowing identification of gene-gene relationships without a priori assumptions. This analysis linked five of the eight genetically associated genes, of which four form a functionally closely related network linked to eight other nodes in a highly significant network. The gene network suggests possible mechanisms by which one or more infectious triggers may lead to dysregulated inflammation and apoptosis, and cardiovascular pathology.
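The selection rule described above (a SNP in or within 5 kb of a known gene, with MAF >0.05) can be sketched as a simple filter. This is an illustrative sketch only, not the pipeline used in the study: the `Gene` and `Snp` records, the function names, and all identifiers and coordinates are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    start: int  # first base of the gene (bp)
    end: int    # last base of the gene (bp)


@dataclass(frozen=True)
class Snp:
    rsid: str
    chrom: str
    pos: int    # position (bp)
    maf: float  # minor allele frequency in the reference panel


def near_gene(snp: Snp, gene: Gene, flank_bp: int = 5000) -> bool:
    """True if the SNP lies inside the gene or within flank_bp of either end."""
    return (snp.chrom == gene.chrom
            and gene.start - flank_bp <= snp.pos <= gene.end + flank_bp)


def select_for_fine_mapping(snps, genes, min_maf=0.05, flank_bp=5000):
    """Keep SNPs with MAF strictly above min_maf that are in or near any gene."""
    return [s for s in snps
            if s.maf > min_maf and any(near_gene(s, g, flank_bp) for g in genes)]


# Hypothetical example data (coordinates are not real).
genes = [Gene("GENE_A", "chr1", 10_000, 20_000)]
snps = [
    Snp("rs_in_gene", "chr1", 15_000, 0.20),       # inside the gene
    Snp("rs_upstream_4kb", "chr1", 6_000, 0.10),   # 4 kb upstream: kept
    Snp("rs_upstream_6kb", "chr1", 4_000, 0.10),   # 6 kb upstream: dropped
    Snp("rs_rare", "chr1", 15_000, 0.05),          # MAF not above 0.05: dropped
    Snp("rs_other_chrom", "chr2", 15_000, 0.30),   # different chromosome: dropped
]
selected = select_for_fine_mapping(snps, genes)
print([s.rsid for s in selected])
```

The strict inequality on MAF mirrors the ">0.05" wording in the text; whether the boundary value itself was included in practice is not stated.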
Central to the putative gene network is CAMK2D (calcium/calmodulin-dependent protein kinase (CaM kinase) II delta), whose expression was significantly downregulated during acute KD. CAMK2D encodes the δ-isoform of CaM kinase II (NP_001212), a ubiquitously expressed calcium-sensitive serine/threonine kinase. The δ-isoform of CaM kinase II is the predominant form expressed in cardiomyocytes and vascular endothelial cells [18] and is involved in a number of pathophysiological processes that make it an attractive candidate in KD. In vascular endothelial cells, CaM kinase II mediates nitric oxide (NO) production by endothelial NO synthase (NOS3, NP_000594) in response to changes in intracellular calcium, and NO causes local vasodilatation [18]. In acute KD, NO production is increased, and NO metabolites decrease following successful treatment [19]. Following KD, especially where there has been overt CA damage, there is endothelial dysfunction and impaired vasodilatation, which can be restored after administration of antioxidants that may increase local availability of NO [20]. More chronically, NOS3 may become dysregulated (‘uncoupled’) and produce potentially harmful superoxide anions, resulting in chronic oxidant stress that is implicated in the pathogenesis of atherosclerosis [21]. In those with severe KD, NOS3 is expressed in coronary artery aneurysm tissue removed at surgery, and the tissue shows a pattern of senescence that is also typical of atherosclerosis [22].
Involvement of leukocyte-expressed CaM kinase II in blood vessel damage and aneurysm formation, key features of KD, is also plausible. In human monocytes, CaM kinase II modulates tumor necrosis factor-induced expression of CD44 (NP_000601), which has a central role in leukocyte migration and extravasation at inflammatory sites [23]. CaM kinase II is also involved in disruption of the endothelial barrier following stimulation with agonists such as thrombin [24], whose levels may be increased following KD [25]. Disruption of barrier integrity in coronary arteries may contribute to leukocyte infiltration into the vessel wall, proteolysis of extracellular matrix proteins and the internal elastic lamina, and subsequent coronary artery aneurysm formation, which is pathognomonic of KD [1].
In addition, CaM kinase II regulates endotoxin- and TNF-mediated apoptosis in human promonocytic cells by regulating the anti-apoptotic gene BIRC3 (Gene ID:330) [26]. Delayed apoptosis of leukocytes is characteristic of acute KD and may contribute to pathogenesis [27]. Intravenous immunoglobulin (IVIg), standard therapy for KD, induces apoptosis of neutrophils in acute KD [28]. In a genome-wide transcriptional study of KD, there was a marked over-representation of apoptosis regulatory genes [16].
Both CaM kinase II and the product encoded by another fine-mapped gene, LNX1 (ligand of numb-protein X1, NP_116011), interact with the NUMB family of proteins [29], [30]. Interestingly, one of the NUMB family members, NUMBL (numb homolog (Drosophila)-like, Gene ID:9253), lies in the same small haplotype block that has recently been associated with KD susceptibility by a linkage study and subsequent fine-mapping [12], [13]. The NUMB gene (Gene ID:8650) showed significantly higher transcript abundance during acute KD.
Both LNX1 and LNX2 (Gene ID:222484), a closely related gene identified by the IPA network, encode proteins that bind the coxsackievirus and adenovirus receptor (CXADR, Gene ID:1525) [31]. CXADR is the receptor for coxsackievirus B3, which causes myocarditis in humans. The myocarditis can be prevented in animal models by antagonizing viral binding to CXADR (NP_001329.1) [32]. Coxsackievirus B3 has also been implicated in acute myocardial infarction [33]. Interestingly, the human endogenous retrovirus K protein Np9 also interacts with LNX1 [34], and therefore a number of viruses may theoretically bind the NUMB/CAR/LNX1 complex, leading to internalization and regulation of CAMK2D activity. This suggests a possible mechanism whereby more than one infectious trigger may result in cardiovascular damage in genetically susceptible individuals suffering from KD.
Other fine-mapped IPA-networked genes include ZFHX3 (also known as ATBF1), which encodes a large enhancer-binding transcription factor that is known to be polymorphic [35] and interacts with a number of proteins, including PIAS3 (protein inhibitor of activated STAT, NP_006090), which inhibits STAT3 (signal transducer and activator of transcription-3, NP_644805) [36]. STAT3 is activated by interleukin 6 (IL6, NP_000591), a pro-inflammatory cytokine that is involved in early innate immune reactivity, as indicated by the high fever, the acute phase response with increased blood levels of CRP (NP_000558), complement factors and fibrinogen, and the myriad cellular markers altered in acute KD [37]. ZFHX3 also interacts with MYH7 (myosin, heavy chain 7, cardiac muscle, beta, NP_000248), in which mutations are known to cause an inherited form of cardiomyopathy [38].
CSMD1 (CUB and Sushi multiple domains 1), which is functionally related to CaM kinase II via histone deacetylase 4 (HDAC4, Gene ID:9759), may be associated with dampening the early phase of KD. CSMD1 is located on chromosome 8, in a region that is hypervariable in humans and which contains numerous immune-related genes [39]. Activation of the classical complement pathway occurs in acute KD [40], and CSMD1 (NP_150094) is a complement regulatory protein that blocks the classical but not the alternative complement pathway [41].
The functions of other fine-mapped genes are generally poorly understood. The most significantly associated gene (NAALADL2, N-acetylated alpha-linked acidic dipeptidase-like 2), which also showed the greatest change in transcript levels between acute and convalescent KD, is a large gene of 32 exons spanning 1.37 Mb. NAALADL2 undergoes extensive alternative splicing leading to multiple 5′ and 3′ untranslated regions and variable coding sequences. Its function is largely unknown, but mutations in the gene may contribute to Cornelia de Lange syndrome (OMIM: 122470) [42].
Overall, we have identified five genetically associated genes that also had significantly reduced transcript levels during acute KD, including three that are closely functionally related, suggesting that these genes may act together. This novel network may be distinct for KD, as differences in transcript abundance in these genes have not previously been described as part of a typical inflammatory response expressed in blood cells. Pathogen-specific host responses, identified by relative transcript abundance in the blood, have been described for other infectious diseases [43].
Investigation of transcript abundance in whole blood rather than specific cell populations allows assessment of the entire peripheral blood transcriptome [43] and may be particularly informative in diseases such as KD, where an infectious trigger is implicated but remains unidentified [1]. In a genome-wide gene expression study of KD, variation in neutrophil and lymphocyte numbers, characteristic of acute KD [44], [45], was thought to account for approximately half of the variation in transcript abundance during the course of the KD illness [16]. Although we did not investigate individual cell populations in the current study, the data suggest that relative changes in transcripts reflect qualitative as well as quantitative differences. Given the enrichment of the expression profile with immune-related genes (selected on the basis of associated loci), the changes in mRNA may not be more numerous than those expected by chance and do not provide definitive proof for the gene-specific associations. While the number of subjects in our expression study is large enough to identify overall trends in the host response during KD, we are unable to comment on expression-related allelic association, which will be investigated in future studies. There is a suggestion from peripheral blood expression data in KD that ‘person-specific’ gene expression patterns, possibly reflective of underlying genetic variation, may be present [16]. Further investigation of the relationship between genomic associations and gene expression will be undertaken, although clearly genetic variants may be significantly associated with disease without resulting in alterations in gene expression.
Our sample of 893 cases represents a large genetic KD cohort drawn from a single ethnic group. KD shares many features of other infectious diseases of young children, including fever, rash and changes to the mucous membranes. There is no diagnostic test, and laboratory parameters individually have insufficient sensitivity or specificity for diagnosis [1]. In all study cohorts we employed a conservative and widely accepted KD case definition in an attempt to maximize phenotypic homogeneity and diagnostic specificity. The similar ethnicity and ascertainment of KD cases in all cohorts reduce the risk of spurious associations [15].
Our methodological approach is consistent with current best practice recommendations in GWAS design and analysis, which aim to identify robust associations and reduce type I errors [15]: (i) the discovery and replication cohorts were recruited using very similar ascertainment techniques and drawn from predominantly Caucasian populations, with careful analysis to exclude cryptic population admixture in the discovery phase, which used a case-control design; (ii) the variants selected for replication were predominantly selected using single-point analysis, although we employed other models, including haplotypic analysis, to maximize the informativeness of the initial GWAS data; (iii) we employed different genotyping technologies in each of the discovery, replication and fine-mapping stages to reduce spurious associations arising from genotyping errors; (iv) we limited our replication genotyping solely to variants identified in the discovery phase, as additional fine-mapping around associated variants in the replication phase may increase spurious associations [46]; (v) we used a staged study design to avoid conservative correction for multiple statistical comparisons that might mask associations of moderate effect size in this modestly sized sample; (vi) we present joint analysis of the discovery and replication data, rather than considering the replication data in isolation; and (vii) we have fine-mapped variants with a MAF>0.05 which lie within or close to known genes.
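The rationale for point (v) can be illustrated with a small calculation. The sketch below is not the analysis performed in this study: it assumes a plain Bonferroni correction, and the SNP counts (`n_genome_wide`, `n_carried_forward`) are hypothetical values chosen only to show how the per-test threshold relaxes when fewer variants are tested in a later stage.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Hypothetical counts, not taken from the study.
n_genome_wide = 250_000     # SNPs tested in a single-stage genome-wide scan
n_carried_forward = 1_000   # SNPs taken forward to the replication stage

single_stage = bonferroni_threshold(0.05, n_genome_wide)
replication_stage = bonferroni_threshold(0.05, n_carried_forward)

# The replication-stage threshold is less stringent by the ratio of test counts.
relaxation = replication_stage / single_stage

print(f"single-stage threshold:      {single_stage:.1e}")
print(f"replication-stage threshold: {replication_stage:.1e}")
print(f"relaxation factor:           {relaxation:.0f}")
```

A variant of moderate effect that could not survive the genome-wide threshold may still pass the less stringent threshold applied to the smaller set of variants carried forward, which is the trade-off a staged design exploits.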
We are aware that the genomic coverage and power of the discovery phase of the GWAS were limited and calculate that the initial GWAS had only approximately 50% power to detect an OR of 2.0 with alpha<0.05. Our relatively modest sample size reflects the difficulties in recruiting for a relatively rare disease in which the phenotype is defined clinically. Our approach therefore aimed to reduce the risk of type I errors by ensuring that a large and independent replication cohort was included as part of the initial design, as we did not expect the associated variants to reach genome-wide significance, given the cohort size in the GWAS discovery phase [47]. It was therefore to be expected that neither previously reported and credible candidate gene associations in KD, such as IL4 (Gene ID:3565) [48], VEGFA (Gene ID:7422) [49], [50], CCR5 (Gene ID:1234) [51], and MBL2 (Gene ID:4153) [52], nor the recently reported ITPKC variant [12] were replicated by the GWAS. Our study has failed to identify these and almost certainly other as yet unidentified variants that represent additional major determinants of KD susceptibility.
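The kind of calculation that underlies a power estimate such as the one quoted above can be sketched as follows. This is a normal-approximation power calculation for a two-sided allelic (2×2) case-control test and is not necessarily the method used in this study. The sample sizes and control allele frequency in the example are hypothetical; power depends strongly on both, so the output should not be read as a reconstruction of the reported figure.

```python
from math import sqrt
from statistics import NormalDist


def case_allele_freq(control_freq: float, odds_ratio: float) -> float:
    """Risk-allele frequency in cases implied by an allelic odds ratio."""
    odds = odds_ratio * control_freq / (1.0 - control_freq)
    return odds / (1.0 + odds)


def allelic_test_power(n_cases: int, n_controls: int, control_freq: float,
                       odds_ratio: float, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided test comparing allele frequencies.

    Each individual contributes two alleles, so the allele counts are
    2 * n_cases and 2 * n_controls.
    """
    p0 = control_freq
    p1 = case_allele_freq(p0, odds_ratio)
    se = sqrt(p1 * (1 - p1) / (2 * n_cases) + p0 * (1 - p0) / (2 * n_controls))
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    z_effect = abs(p1 - p0) / se
    # Probability that the test statistic falls in either rejection region.
    return std_normal.cdf(z_effect - z_crit) + std_normal.cdf(-z_effect - z_crit)


# Hypothetical inputs, not taken from the study.
power = allelic_test_power(n_cases=100, n_controls=100,
                           control_freq=0.20, odds_ratio=2.0, alpha=0.05)
print(f"approximate power: {power:.2f}")
```

Replacing `alpha` with a threshold corrected for the number of SNPs tested lowers the estimated power substantially, which is one reason modestly sized discovery cohorts rarely yield genome-wide significant associations.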
We have identified a number of novel associated SNPs, confirmed by fine-mapping, which lie within or close to previously unrecognized candidates for KD. The effect sizes, independent verification in different populations, differential transcript abundance and network analyses all indicate that at least a proportion of these variants represent novel genetic risk factors for KD. Some of the associated genes may interact to mediate the deleterious effects of infection-driven inflammation on the cardiovascular system. Further characterization of the associated genes and their functional interactions may lead to the identification of novel diagnostic and therapeutic targets in KD and may be informative about early pathogenic processes in other cardiovascular diseases.