Diagnosis of ischemic stroke is based on clinical impression combined with brain imaging. However, in the acute setting, brain imaging is not always readily accessible, and clinical evaluation by persons experienced in stroke is not always readily available. In such settings, a blood test could be of use to diagnose IS. Several protein biomarkers have been associated with IS, but in the acute setting these have not yet shown sufficient sensitivity or specificity to be clinically useful [4-6]. In this study we show that gene expression profiles could be used as biomarkers of IS, replicate our previous findings, and refine the gene expression signature of IS by including more relevant control groups.
We previously reported a 29-probe set profile that distinguished IS from healthy controls [1]. When this profile was used to classify a larger cohort of patients in this study, it distinguished IS from healthy subjects with a sensitivity of 92.9% and a specificity of 94.7%. This is important because it validates the concept that gene expression profiles can identify patients with stroke. Replication of gene expression profiles has been a challenge in the field, in large part due to false discovery associated with performing multiple comparisons. Robust biological responses and careful analyses made it possible to validate this 29-probe set profile in this study.
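As a reminder of how the two figures quoted above are defined, the following minimal sketch computes sensitivity and specificity from true and predicted group labels. The function name and the toy labels are hypothetical and chosen for illustration only; they are not the study data and do not reproduce the reported values.

```python
def sensitivity_specificity(true_labels, predicted_labels, positive="IS"):
    """Return (sensitivity, specificity), treating `positive` as the disease class."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == positive:
            if pred == positive:
                tp += 1  # stroke sample classified as stroke
            else:
                fn += 1  # stroke sample missed
        else:
            if pred == positive:
                fp += 1  # control sample classified as stroke
            else:
                tn += 1  # control sample classified as control
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative sample")
    sensitivity = tp / (tp + fn)  # fraction of IS samples correctly identified
    specificity = tn / (tn + fp)  # fraction of controls correctly identified
    return sensitivity, specificity


# Hypothetical toy labels (assumption, not study data): 4 IS and 5 healthy samples.
truth = ["IS", "IS", "IS", "IS",
         "healthy", "healthy", "healthy", "healthy", "healthy"]
predicted = ["IS", "IS", "IS", "healthy",
             "healthy", "healthy", "healthy", "healthy", "IS"]

sens, spec = sensitivity_specificity(truth, predicted)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

In this toy example one of four IS samples is missed and one of five controls is misclassified, so both measures are computed from the same predictions but over different denominators.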
To obtain more biologically useful predictors of IS, we identified gene profiles that distinguish patients with IS from patients with vascular risk factors and from patients with MI. Using the individual group comparisons, we distinguished IS from the vascular risk factor group with over 95% sensitivity and specificity, and IS from MI with over 90% sensitivity and over 80% specificity. Biologically, this suggests at least some differences in the immune responses to infarction in brain and heart.
The 3-hour time point was the focus of most comparisons because it represents the critical time when decisions are made regarding acute therapy such as thrombolysis. Thus, for the development of a point-of-care test, this is the time period when gene expression profiles could be of greatest use. With the 60-probe set signature, at the 3-hour time point, we achieved correct classification rates of 85-94%, 92-96%, 88% and 68-84% for IS, vascular risk factor, MI and healthy controls, respectively. These are approaching clinically useful ranges.
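A per-group correct classification rate of the kind reported above is the fraction of samples in each true group that the classifier assigns to that same group. The sketch below illustrates the calculation under stated assumptions: the function name, the group label "VRF" (standing in for the vascular risk factor group) and the toy labels are hypothetical and are not the study data.

```python
from collections import Counter


def per_group_correct_rates(true_labels, predicted_labels):
    """Return {group: fraction of that group's samples classified correctly}."""
    totals = Counter(true_labels)
    correct = Counter(
        truth for truth, pred in zip(true_labels, predicted_labels) if truth == pred
    )
    # Counter returns 0 for groups with no correct predictions.
    return {group: correct[group] / totals[group] for group in totals}


# Hypothetical toy labels (assumption, not study data).
truth = ["IS", "IS", "IS", "IS", "VRF", "VRF", "MI", "MI", "healthy", "healthy"]
predicted = ["IS", "IS", "IS", "MI", "VRF", "VRF", "MI", "IS", "healthy", "VRF"]

rates = per_group_correct_rates(truth, predicted)
for group, rate in rates.items():
    print(f"{group}: {rate:.0%}")
```

Reporting the rate separately for each group, rather than a single overall accuracy, keeps a well-classified large group from masking poor performance in a smaller one.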
Though RNA profiles were the focus in this study, the identified genes could be used as a guide in the evaluation of protein biomarkers for ischemic stroke. Genes for Factor 5 and thrombomodulin were both identified as differentially expressed in IS compared to controls. Both of these molecules have also been identified as proteins associated with IS [1, 7, 12]. Many of the other genes we identified have not yet been studied, but may represent potential candidates for the development of protein biomarker profiles.
The goal of this study was not to identify all differentially expressed genes between IS and controls, but rather to identify sets of genes whose patterns of expression may be useful for stroke diagnosis. As a result, these analyses have excluded large numbers of differentially expressed genes that are biologically relevant in IS. These will be the subject of future studies. Limitations of this study include: (1) the lack of stroke “mimics” in the control groups; (2) the lack of validation by qRT-PCR, which would likely be used for clinical applications; (3) the confounding treatment effects in the 5h and 24h blood samples from IS patients; (4) race was not factored in because its distribution differed between groups, with zero subjects in some of the race categories; (5) age is a confounder, which we tried to address by factoring it into the ANCOVA models and by selecting control groups with an age distribution close to that of the IS patients; and (6) an ANOVA for all of the groups combined yielded a significant number of regulated genes, but these genes were not as predictive. This likely occurred because the PAM derivation of the training set of genes was not optimal, whereas individual group comparisons yielded more predictive genes. In the end, statistical validation was achieved by using our training set of genes to predict an independent test set of samples.