Acute Kawasaki disease (KD) is difficult to distinguish from other illnesses that involve acute rash or fever, in part because the etiologic agent(s) and pathophysiology remain poorly characterized. As a result, diagnosis and critical therapies may be delayed.
We used DNA microarrays to identify possible diagnostic features of KD. We compared gene expression patterns in the blood of 23 children with acute KD and 18 age-matched febrile children with 3 illnesses that resemble KD.
Genes associated with platelet and neutrophil activation were expressed at higher levels in patients with KD than in patients with acute adenovirus infections or systemic adverse drug reactions, but levels in patients with KD were not higher than those in patients with scarlet fever. Genes associated with B cell activation were also expressed at higher levels in patients with KD than in control subjects. A striking absence of interferon-stimulated gene expression in patients with KD was confirmed in an independent cohort of patients with KD. Using a set of 38 gene transcripts, we successfully predicted the diagnosis for 21 of 23 patients with KD and 7 of 8 patients with adenovirus infection.
These findings provide insight into the molecular features that distinguish KD from other febrile illnesses and support the feasibility of developing novel diagnostic reagents for KD based on the host response.
Efforts to discern the etiology of acute febrile disease are hampered by the paucity of reliable discriminating clinical features, difficulties in obtaining appropriate specimens, insensitive methods for detecting known causative agents, and the lack of diagnostic tests for some conditions associated with fever, such as autoimmune diseases and adverse drug reactions. As a result, many acute febrile illnesses remain unexplained, especially in the early days after onset of clinical signs and symptoms. Molecular profiling of the host response offers an approach for classifying acutely ill hosts that complements traditional diagnostic approaches based on microbial detection. Studies of human genome–wide transcript abundance patterns in peripheral blood suggest that these patterns might provide useful information about the disease mechanism, outcome, nature of the infectious agent, and diagnosis [2–6]. This last possibility has not been adequately explored, especially in the setting of a clinical syndrome that presents important diagnostic dilemmas.
Kawasaki disease (KD) is an acute, self-limited inflammatory illness of infants and children; ~25% of untreated patients develop coronary artery aneurysms or ectasia. Intravenous immunoglobulin (IVIG) reduces the rate of coronary artery aneurysms to ~5% when administered within the first 10 days of illness, but KD remains the leading cause of acquired pediatric heart disease in developed nations.
Despite 30 years of research, no etiologic agent has been identified for KD. In the absence of a specific diagnostic test, KD is diagnosed according to clinical criteria, many of which are shared by other illnesses characterized by rash or fever, including adenovirus infections, streptococcal scarlet fever, and systemic drug reactions. Many children with KD consequently receive an erroneous or late diagnosis, which leads to delays in treatment and an increased risk of coronary artery aneurysm formation [8, 9].
We recently examined whole-blood genomewide transcript abundance patterns in patients with KD and identified specific transcript levels associated with the risk of subsequent failure to respond to IVIG therapy. In the current study, we compared patterns of whole-blood gene expression in patients during the acute phase of KD with patterns found in patients during the early phase of 3 illnesses that have similar clinical presentations but well-defined alternative etiologies. We identified patterns of gene expression and corresponding biological programs that were different in patients with KD, compared to patients with the other illnesses, and we were able to distinguish between KD and adenovirus infection on the basis of gene expression patterns. The results from this study indicate that comparative analysis of host gene expression profiles is a promising approach for better understanding febrile illnesses and that such analysis may contribute to the development of a test for KD that enables more accurate and timely diagnosis.
Three groups of pediatric patients—patients with KD (n = 23), febrile control subjects (n = 18), and healthy, nonfebrile control subjects (n = 10)—were enrolled at 2 clinical sites (Rady Children's Hospital–San Diego and Children's Hospital Boston) after obtaining informed consent from parents or legal guardians. A human subjects research protocol was reviewed and approved by the institutional review boards at the University of California, San Diego, Children's Hospital Boston, and Stanford University. All patients with KD had fever and ≥4 of the 5 principal clinical criteria for KD (ie, rash, conjunctival injection, cervical lymphadenopathy, changes in the oral mucosa, and changes in the extremities) or 3 criteria in combination with coronary artery abnormalities documented by echocardiography. Coronary artery dimensions were recorded for all patients with KD.
Nasopharyngeal (NP) and stool samples for viral cultures were obtained from all febrile control subjects. Control subjects classified as having acute adenovirus infection had fever for ≥3 days, conjunctival or mucocutaneous changes, a negative result from culture of a throat sample for group A β-hemolytic Streptococcus (GAS), and an NP culture positive for adenovirus. Control subjects classified as having acute streptococcal scarlet fever had fever, a diffuse scarlatiniform rash, clear conjunctivae, NP and stool samples negative for virus on culture, and a positive rapid test result for GAS. Control subjects with systemic drug reactions had fever and systemic signs associated with ingestion of a drug known to cause hypersensitivity reactions, a throat culture negative for GAS, and NP and stool samples negative for virus on culture. Healthy pediatric control subjects (n = 10) were children <6 years of age undergoing minor elective surgery for polydactyly. Clinical data, including sex, ethnicity, age, day of illness (the first day of fever was defined as day 1 of illness), results of laboratory testing, response to IVIG therapy, and coronary artery status, were recorded for all subjects (Appendix). Blood samples were obtained for determination of complete blood count and differential; erythrocyte sedimentation rate; levels of C-reactive protein, alanine aminotransferase, and γ-glutamyl transferase; and RNA studies (PAXgene Blood RNA System; PreAnalytiX GmbH).
RNA transcripts in the samples and a standard reference RNA (Universal Human Reference RNA; Stratagene) were amplified using the MessageAmp aRNA amplification kit (Ambion). Sample and reference transcripts were then reverse-transcribed, labeled with fluorescent dyes (cyanine [Cy] 5 and Cy3, respectively), mixed together, and hybridized to complementary DNA (cDNA) microarrays (Appendix). The data are available from the Stanford Microarray Database and NCBI's Gene Expression Omnibus (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE15297).
GeneTrail, the Molecular Signature Database, and NextBio (http://nextbio.com) were used to identify Gene Ontology (GO) terms, biological pathways from the Kyoto Encyclopedia of Genes and Genomes (http://www.genome.jp/kegg/), and other array data sets associated with gene sets in our analysis. Terms in GeneTrail were identified using a false discovery rate of 5%. Significance analysis of microarrays was used to identify genes associated with differences between patients with KD and each control group at a median false discovery rate of 1%. Prediction analysis of microarrays with 10-fold cross-validation was used to identify and assess the predictive values of genes distinguishing patients with KD from those with adenovirus infection. Potential differences in clinical parameters were tested using nonparametric statistical analyses and Stata software (version 7; StataCorp).
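The 10-fold cross-validation scheme mentioned above can be illustrated with a short sketch. It shows only how 31 samples (23 KD and 8 adenovirus) might be partitioned into 10 held-out folds; it is not the prediction analysis of microarrays software, which balances classes across folds in its own way. The function name `k_fold_indices` and the random seed are hypothetical.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and deal them into k folds of near-equal size."""
    order = list(range(n_samples))
    random.Random(seed).shuffle(order)
    return [order[i::k] for i in range(k)]

# 23 KD + 8 adenovirus samples, as in the comparison described above
folds = k_fold_indices(31)

# Each fold is held out once; a classifier would be fit on `training`
# and evaluated on `held_out` (the classifier itself is omitted here).
splits = []
for held_out in folds:
    training = [i for fold in folds if fold is not held_out for i in fold]
    splits.append((training, held_out))
```

Every sample is predicted exactly once, by a model that never saw it during training, which is what makes the reported accuracy an out-of-sample estimate.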
Levels of ISG15, LY6E, and MX1 mRNA were measured using TaqMan 5′-nuclease Gene Expression Assays (Applied Biosystems) (Appendix).
RNA from the peripheral blood of 23 children with KD and 18 children with a well-defined illness that had a similar clinical presentation was processed and hybridized to cDNA arrays. The demographic and clinical characteristics of the subjects in these groups are presented in table 1.
To focus on the transcripts with the greatest differences in abundance among the subjects, we identified 808 transcripts that varied ≥3-fold from the median in ≥2 of the 41 arrays, and we used unsupervised clustering to organize the samples (figure 1). Sample cluster A formed 1 of the 2 main branches of the dendrogram and consisted primarily of samples from patients with KD; 14 of the 19 samples were from patients with KD, and 4 of the remaining 5 samples were from subjects with scarlet fever. Conversely, sample cluster B contained 7 of 8 samples obtained from subjects with adenovirus infection, along with 3 samples from subjects with systemic drug reactions and 1 from a patient with KD. The grouping of samples suggested that disease-specific gene expression was a prominent feature of the overall transcript profiles. We then compared the correlation coefficients for pairs of KD and non-KD samples from a given control group with the correlation coefficients for comparisons between KD samples; the transcript profiles of the subjects with adenovirus infection or systemic drug reactions differed significantly from those of the patients with KD, but the profiles of subjects with scarlet fever did not (figure 2, which appears only in the electronic version of the Journal).
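The variation filter described above (≥3-fold from the median in ≥2 arrays) can be sketched as follows. This is a minimal illustration that assumes each transcript is stored as a list of log2 ratios with no missing values; the function name `varies_enough` and the toy data are hypothetical and are not drawn from the study.

```python
import math
from statistics import median

def varies_enough(log2_ratios, fold=3.0, min_arrays=2):
    """True if the transcript deviates >= `fold` from its own median
    in at least `min_arrays` arrays (values are log2 ratios)."""
    center = median(log2_ratios)
    threshold = math.log2(fold)  # a 3-fold change is ~1.585 on the log2 scale
    hits = sum(1 for v in log2_ratios if abs(v - center) >= threshold)
    return hits >= min_arrays

# Toy data: one list of log2 ratios per transcript (hypothetical values)
toy = {
    "gene_a": [0.0, 0.1, -0.2, 2.0, -1.9],  # two arrays beyond 3-fold
    "gene_b": [0.0, 0.3, -0.4, 0.2, 1.7],   # largest deviation is below 3-fold
    "gene_c": [0.1, -0.1, 0.0, 0.2, -0.2],  # essentially flat
}
selected = [name for name, row in toy.items() if varies_enough(row)]
```

Working on the log2 scale makes the filter symmetric, so a 3-fold increase and a 3-fold decrease are treated alike.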
We used significance analysis of microarrays to identify specific transcripts that were significantly more or less abundant in the children with KD than in subjects from each of the 3 control groups (table 2, which is available only as a spreadsheet in the online version of the Journal). In patients with KD, 130 transcripts were differentially expressed in comparison to subjects with adenovirus infections, 135 transcripts were differentially expressed in comparison to subjects with systemic drug reactions, and 4 transcripts were differentially expressed in comparison to subjects with scarlet fever; 29 transcripts were common to the first 2 comparisons. To identify additional transcripts whose abundance was similar across control groups but different from that in the KD group, we compared transcript levels in the patients with KD with those in the combined set of febrile control subjects; 31 additional transcripts were identified in this manner.
To identify putative functional processes associated with these transcripts, we clustered the set of 271 differentially expressed transcripts (figure 3) and applied several complementary approaches. We identified GO terms and annotated biological pathways that were overrepresented, we identified other array data sets that were enriched for genes present in each of these 4 gene clusters, and we incorporated information about the temporal pattern of expression for these genes from a previous study of KD.
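The total of 271 transcripts can be reconciled with the per-comparison counts by simple inclusion-exclusion. The sketch below assumes that the 29 shared transcripts are the only overlap among the comparisons, that is, that the 4 scarlet fever transcripts and the 31 pooled-comparison transcripts are distinct from the rest; the text does not state this explicitly, so treat it as an assumption that happens to reproduce the reported total.

```python
# Counts reported for each comparison against patients with KD
vs_adenovirus = 130
vs_drug_reaction = 135
vs_scarlet_fever = 4
shared_adeno_drug = 29   # common to the adenovirus and drug-reaction lists
extra_from_pooled = 31   # found only with the combined febrile control set

# Inclusion-exclusion, assuming no overlaps other than the 29 shared transcripts
total = (vs_adenovirus + vs_drug_reaction - shared_adeno_drug
         + vs_scarlet_fever + extra_from_pooled)
print(total)  # 271
```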
There were 4 main clusters among the KD-associated genes, each with ≥20 transcripts and an average Pearson correlation coefficient of 0.5. The first of these clusters (C1) (figure 3) consisted of 45 transcripts that were more abundant in patients with KD than in control subjects. Forty-two transcripts were significantly less abundant in subjects with adenovirus infection than in those with KD, and 13 were less abundant in subjects with drug reactions. Genes represented in the C1 cluster were enriched for the GO term of contractile fibers (P < .01); these included genes encoding vinculin and the myosin light chain 4 and 9 subunits. Each of these genes is involved in modulating cell adhesion, morphology, and migration, as are a number of other genes in this cluster, including SPARC, ITGA2B, ITGB5, TUBB1, and TBCC. SPARC, SELP, ITGA2B, and TUBB1 are also highly expressed in platelets; this gene set was significantly enriched for genes expressed during in vitro differentiation of stem cells into megakaryocytes (P < .01) and for genes associated with hematopoietic cell maturation (P < 10⁻⁵).
Cluster 2 (C2) (figure 3) also contained transcripts that were more abundant in patients with KD than in subjects with adenovirus infections or drug reactions. Approximately half of the transcripts (54 of 100) were among those previously identified as defining the acute phase of KD in our study of temporal patterns of gene expression in KD. These included a number of genes encoding antimicrobial peptides (BPI, SLPI, DEFA1, PI3) and other genes associated with early innate immune responses (S100 genes S100P and S100A12, MMP9, SORL1, and PBEF). No GO terms met our criteria for significance, but we identified 2 overlapping Kyoto Encyclopedia of Genes and Genomes pathways, “apoptosis” and “insulin signaling,” that were significantly associated with this gene cluster (P < .01). Genes associated with these pathways encoded proteins in signaling pathways linked to immune responses and inflammation, including interleukin (IL)–1 receptor type 1, IL-1 receptor–associated protein, IKBKG, PIK3R1, and PRKAR1A. Additional cytokine receptors (IL-4 receptor and interferon [IFN]–γ receptor subunit 1) were also present in this gene cluster. C2 also included many genes known to be expressed at higher levels in neutrophils than in other leukocyte subsets, as well as genes associated with differentiation of both CD34+ myeloid cells and monocytes (P < 10⁻¹⁴ and P < 10⁻⁴, respectively) [19, 20].
The 33 transcripts in cluster 3 (C3) (figure 3) were also significantly more abundant in patients with KD than in control subjects. Twenty-nine achieved statistical significance only when the pooled set of febrile control subjects was compared with the patients with KD, suggesting a consistent difference between patients with KD and the febrile control subjects within this cluster, but a less dramatic difference than within clusters 1 and 2. This impression was reinforced by a power calculation indicating that similar sample sizes would be needed to identify significant differences for each control group (figure 4, which appears only in the electronic version of the Journal).
C3 was highly enriched for genes whose expression is associated with B cells. The pattern of transcript abundance was associated with a gene expression program that characterizes a nonplasma cell stage of activation: PAX5, which plays a central role in the early activation and differentiation of B cells and whose expression must be repressed for development of plasma cells, was more highly expressed in patients with KD, as were genes known to be direct targets of PAX5 transcriptional activation (CD79A, MS4A2, IGHM, SPIB, HLA-DNA, TRAF5). Unlike the expression of genes in C1 and C2, the expression in C3 did not diminish when the acute phase of the disease ended; the average transcript abundance for individual patients with KD was very similar weeks to months later, during the convalescent phase (P = .92; Student's paired t test).
The fourth cluster (C4) (figure 3) consisted of 60 transcripts that were less abundant in patients with KD than in those with adenovirus infection. The defining feature of this gene cluster was the presence of canonical IFN-induced genes, such as MX1, MX2, ISG15 (G1P2), IFIT2, OAS1, and OAS2. As a group, C4 was highly enriched for genes known to be expressed after cell stimulation with type I IFNs, both in vitro and in subjects with hepatitis C treated with pegylated IFN (all P < 10⁻¹⁵) [22–24]. To verify this dramatic difference in IFN-induced gene expression in a separate group of subjects, we used RT-PCR to measure levels of ISG15, LY6E, and MX1 in an independent cohort of 10 patients with KD, 12 subjects with adenovirus infection, and 8 healthy control subjects, matched for age and sex. Levels of these transcripts in the patients with KD were again significantly lower than those in subjects with adenovirus infection and similar to those in healthy children (figure 5).
The presence of multiple expression programs that differed between adenovirus infection and KD suggested that the 2 diseases might be distinguished using transcript levels. We used prediction analysis of microarrays to identify a set of genes that might differentiate between these cohorts. Using a set of 38 genes and 10-fold cross-validation, we were able to correctly identify 21 of 23 patients with KD and 7 of 8 with adenovirus infection (figure 6A). Sixteen genes were among the IFN-induced gene set, and the other 22 were distributed among the gene sets associated with cell adhesion or motility and innate immune responses (figure 6B).
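The classification counts above translate directly into summary rates. The sketch below treats KD as the positive class and adenovirus infection as the negative class, which is an illustrative convention rather than one stated in the text; the variable names are hypothetical.

```python
# Cross-validated classification counts reported above
kd_correct, kd_total = 21, 23        # patients with KD identified as KD
adeno_correct, adeno_total = 7, 8    # adenovirus subjects identified as adenovirus

# Treating KD as the "positive" class (an assumption for illustration)
sensitivity = kd_correct / kd_total
specificity = adeno_correct / adeno_total
accuracy = (kd_correct + adeno_correct) / (kd_total + adeno_total)

print(f"sensitivity = {sensitivity:.3f}")  # 0.913
print(f"specificity = {specificity:.3f}")  # 0.875
print(f"accuracy    = {accuracy:.3f}")     # 0.903
```

With only 8 adenovirus samples, a single misclassification shifts the specificity estimate by 12.5 percentage points, so these rates carry wide uncertainty.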
We analyzed patterns of gene transcript abundance in the blood of subjects with acute KD, adenovirus or streptococcal toxin–mediated infections, or systemic drug reactions. These 4 clinical cohorts were chosen because experience and the published literature indicate that they can be clinically confused in the populations from which the patients with KD were recruited [25, 26]. We identified sets of genes whose transcript abundance levels were associated specifically with KD; the corresponding functional annotations point to cellular processes that may contribute to the unique pathogenic process that defines KD. A subset of these genes was used to distinguish patients with KD from those with adenovirus infections, with high sensitivity and specificity.
Two sets of genes were expressed at higher levels in patients with KD than in subjects with adenovirus infections or systemic drug reactions. The first was associated with expression programs that regulate and promote cellular adhesion and motility and are also highly expressed in platelets. The second contained many genes associated with innate immune responses generated by neutrophils. High numbers of platelets and neutrophils are characteristic of acute KD, but transcript levels of these genes are unlikely to result simply from differences in the relative number of these 2 cell subsets. Although platelet counts in the patients with KD in this study were higher than those in subjects with adenovirus infection or systemic drug reactions, they were also higher than in those with scarlet fever, despite the absence of a significant difference in gene expression. Similarly, there were no significant differences in relative neutrophil abundance among the different subject groups in this study that could explain the differences in expression of genes associated with innate immune responses. Despite known temporal variation in the expression of genes associated with innate immune responses within the acute phase of KD, the similarities in day of illness for the patients with KD, adenovirus infections, and scarlet fever also make it unlikely that this variable alone explains differences in the abundance of transcripts for these genes (table 1).
The expression of these genes may instead reflect the activation state of the cell subsets involved and may help explain the cellular events that contribute to coronary artery damage in patients with KD. Platelet activation has long been recognized as associated with coronary artery aneurysm formation in KD, and platelets can respond to and generate inflammatory signals in ways that modulate their interaction with both neutrophils and blood vessel endothelia (reviewed in Zimmerman et al). Similarly, it has become clear that variation in neutrophil gene transcription is associated with differences in neutrophil function.
A third set of transcripts that were more abundant in patients with KD than in each of the 3 control groups was enriched for genes associated with B cell differentiation and activation. The presence of PAX5 and IGHM in this set of genes is consistent with an increased proportion of either mature naive B cells or immunoglobulin M (IgM)–positive memory cells in KD. Patterns of transcript abundance in memory B cells and naive cells share similarities when compared with plasma cells. Kuo et al found a subset of the genes associated with KD among those expressed at higher levels in memory B cells than in plasma cells. Our findings are also consistent with a recent report that CD180 (Ly64) is expressed in a higher fraction of B cells in patients with KD than in febrile control subjects; CD180 is also expressed at higher levels in memory B cells than in either naive or plasma cells. Studies of purified B cell populations are needed to determine whether the expression pattern we observed in this study is a memory B cell signature, but these findings may help explain previous reports that peripheral blood mononuclear cells from children with KD spontaneously secreted immunoglobulin at higher levels than those from febrile control subjects; memory B cells are primed for rapid activation and differentiation into antibody-secreting plasma cells, and recent studies suggest that IgM-positive memory B cells respond to both T cell–independent and T cell–dependent antigens.
The fourth set of transcripts that distinguished patients with KD from febrile control subjects consisted of IFN-stimulated genes, which were expressed at lower levels in patients with KD than in those with adenovirus infections. Lower expression levels of these genes in subjects with acute KD than in both subjects recovering from KD and healthy control subjects suggest that an IFN response is absent in patients with KD. It is unlikely that the pattern of expression we observed represents a defect in the immune repertoire of patients with KD; unlike individuals with known defects in IFN signaling, children who have had KD do not suffer from a general susceptibility to infections. The absence of an IFN response does not necessarily imply that KD is not caused by a virus. Many viruses evade the IFN response by a wide variety of mechanisms, including inhibition of IFN production and of IFN signaling, both of which would decrease the expression of IFN-stimulated genes (reviewed in Randall et al).
Adenovirus infections are particularly problematic in the differential diagnosis of KD because the currently available rapid test for this common group of viruses is very insensitive, unlike the one for streptococcal disease. A PCR-based test using broadly cross-hybridizing primers is highly sensitive and specific but remains a research tool. Immunofluorescence-based tests suffer from poor sensitivity (~50%), although these tests are highly specific, and a positive test result for cells from the nasopharynx suggests current infection. Many of the laboratory features associated with acute adenovirus infection, such as elevated white blood cell counts with neutrophil predominance and high levels of acute-phase reactants (eg, fibrinogen and C-reactive protein), parallel those associated with acute KD. Using a subset of the differentially expressed genes identified in this study, we were able to correctly identify 21 of 23 subjects with acute KD and 7 of 8 subjects with adenovirus infection.
We recognize several limitations to our study. First, small sample size and a restricted set of control illnesses may limit the ability to generalize from our results; larger studies that include additional diseases are needed. Second, although subjects were classified on the basis of generally accepted clinical and laboratory criteria, some may have been misclassified.
Our findings have implications for future research into the etiology and diagnosis of KD. First, the search for an etiologic agent should focus on agents that either suppress or fail to elicit a robust type I IFN response in vivo. Second, adenovirus infections are among the diseases most often confused with KD [25]; efforts to develop improved diagnostic tests for KD might be well served by including both human gene products that are highly expressed in adenovirus infection but not in KD and gene products with the inverse differential abundance profile. Combinations of markers, some derived from host response and some from possible causative agents, may provide the most effective approach for diagnosis and classification in acute febrile subjects.
We wish to thank Jane Newburger for providing samples and clinical information, Patrick Brown and Elizabeth Joyce for comments and discussion, Joan Pancheri for sample collection, DeeAnna Scherrer for RNA isolation, and Kristy Coolley for assistance with sample preparation and microarray hybridization.
Financial support: National Institutes of Health (grant NHLBI R01-HL69413 to J.C.B. and J.T.K.); Horn Foundation (D.A.R.).
IVIG nonresponse was defined as persistent or recrudescent fever (≥38°C [100.4°F] rectally or orally) 48 h after completion of the IVIG infusion (2 g/kg). The internal diameters of the right coronary and left anterior descending arteries were classified by echocardiography as normal (z score, <2.5 [standard deviations from the mean, normalized for body surface area]), dilated (z score, 2.5 to <4.0), or aneurysmal (saccular or fusiform dilatation of a coronary artery segment; z score, ≥4.0).
PAXgene samples were stored at 4°C on the day of collection, and RNA was extracted within 1 week, in accordance with the manufacturer's directions. Full details of the protocols used for array hybridization are available at http://cmgm.stanford.edu/pbrown/protocols. Images of hybridized arrays were obtained using a GenePix 4000B microarray scanner and analyzed with GenePix software (version 5.0; Axon Instruments). The arrays used for these studies contain 37,632 spots derived from cDNA clones representing ~18,000 unique human genes.
Data were filtered to include only clones that met the following criteria for ≥80% of the samples tested: signal intensity 2.5-fold above background in either the Cy5 (sample) or Cy3 (reference) channel and a regression correlation for the 2 channels of ≥0.6 across each measured element. A normalization factor was applied so that the mean log2 ratio for each array (sample) was 0, and data for each clone were then median centered across all observations. Selected data were organized using a hierarchical clustering algorithm based on a Pearson correlation metric, with average linkage clustering, and visualized using TreeView software, version 1.1.1.
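The z score cutoffs above can be written as a small classification function. This sketch uses the z score alone; the definition in the text also refers to the morphology of an aneurysm (saccular or fusiform dilatation), which a numeric threshold cannot capture. The function name `classify_coronary_z` is hypothetical.

```python
def classify_coronary_z(z_score):
    """Map a coronary artery internal-diameter z score to the categories
    described above: normal (<2.5), dilated (2.5 to <4.0), aneurysmal (>=4.0)."""
    if z_score < 2.5:
        return "normal"
    if z_score < 4.0:
        return "dilated"
    return "aneurysmal"

# Boundary values fall into the higher category, matching the stated ranges
examples = {z: classify_coronary_z(z) for z in (1.0, 2.5, 3.9, 4.0)}
```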
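The normalization and similarity steps described above can be sketched in plain Python. The example assumes a complete matrix of log2 ratios (rows are clones, columns are arrays) with no missing values, whereas the filtered data set may contain some; the function names and toy values are hypothetical, and this is not the software used in the study.

```python
import math
from statistics import mean, median

def normalize(matrix):
    """matrix[clone][array] holds log2 ratios.
    Step 1: shift each array (column) so its mean log2 ratio is 0.
    Step 2: median-center each clone (row) across all arrays."""
    n_arrays = len(matrix[0])
    array_means = [mean(row[j] for row in matrix) for j in range(n_arrays)]
    shifted = [[row[j] - array_means[j] for j in range(n_arrays)] for row in matrix]
    centered = []
    for row in shifted:
        row_median = median(row)
        centered.append([v - row_median for v in row])
    return centered

def pearson(x, y):
    """Pearson correlation, the similarity metric used for clustering.
    Raises ZeroDivisionError if either profile is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

toy = [[1.0, 2.0, 3.0],
       [2.0, 2.0, 5.0],
       [0.0, -1.0, 1.0]]
normalized = normalize(toy)
```

Because the clones are median centered after the arrays are mean centered, the array means are no longer exactly 0 in the final matrix; the sketch follows the order given in the text. Average linkage clustering would then merge the pair of clusters with the highest mean pairwise correlation at each step.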
cDNA was synthesized using Oligo(dT)20 (Invitrogen) and Superscript III reverse transcriptase (Invitrogen), in accordance with the manufacturer's instructions. PCRs were prepared using the TaqMan Universal PCR Master Mix (Applied Biosystems) and cDNA derived from 20 ng of total RNA. The relative abundance of the target transcripts was calculated by comparison with a standard curve and normalized to the expression level of TATA box binding protein–associated factor, RNA polymerase I, B (TAF1B). The Applied Biosystems assay catalog numbers are as follows: ISG15, Hs00192713_m1; LY6E, Hs00158942_m1; MX1, Hs00182073_m1; TAF1B, Hs00374547_m1.
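Quantification against a standard curve, followed by normalization to a reference transcript, can be sketched as below. The text does not give the form of the curve, so the sketch assumes the common log-linear relationship between threshold cycle (Ct) and input quantity; the slope, intercept, and Ct values are hypothetical and chosen only for illustration.

```python
def quantity_from_ct(ct, slope, intercept):
    """Invert an assumed log-linear standard curve:
    Ct = slope * log10(quantity) + intercept."""
    return 10 ** ((ct - intercept) / slope)

def relative_abundance(target_ct, reference_ct, target_curve, reference_curve):
    """Target quantity divided by reference quantity (TAF1B in the text).
    Each curve is a (slope, intercept) pair fit from a dilution series."""
    target_qty = quantity_from_ct(target_ct, *target_curve)
    reference_qty = quantity_from_ct(reference_ct, *reference_curve)
    return target_qty / reference_qty

# Hypothetical curve parameters; a slope near -3.32 corresponds to ~100% efficiency
curve = (-3.32, 30.0)
ratio = relative_abundance(target_ct=26.68, reference_ct=30.0,
                           target_curve=curve, reference_curve=curve)
```

With these illustrative numbers the target crosses threshold 3.32 cycles earlier than the reference, which corresponds to roughly a 10-fold higher relative abundance.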
Potential conflicts of interest: none reported.
Presented in part: Pediatric Academic Societies and Asian Society for Pediatric Research Joint Meeting, Honolulu, Hawaii, 2–6 May 2008 (abstract 6430.1).