We sought to characterize the ability of a fully automatic image classification method to separate structural MRI brain scans of HD gene carriers in the presymptomatic phase from those of controls. Subjects with a more than 33% probability of clinical diagnosis of HD within 5 years were correctly separated from controls 69% of the time without any a priori regional weighting. Although this accuracy is clearly above chance (see CIs in ), it is far from perfect. It is interesting that the whole-brain classification accuracy in this study falls substantially below the 82% correct classification achieved in an earlier study that applied an SVM to diffusion-weighted imaging (DWI) data, which are less readily available in clinical practice than T1-weighted data.6
Subjects in the DWI study were unrelated to those of this one and as a group were estimated to be on average 19 years from clinical presentation. Although CIs will overlap, the suggestion is that diffusion imaging is better at classifying HD images. This conclusion is at odds with results from (univariate) VBM studies that show highly significant differences between PSC and control group T1-weighted images2 that are larger than those obtained using DWI.22,23 The differences in acquisition time (10 minutes for a T1 compared with 22 minutes [12 minutes without cardiac gating] for a DWI sequence) and the fact that the study reported here used a multicenter data set are two likely explanations for this apparent disagreement.
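The comparison of the two accuracies above, against chance and against each other, can be made concrete with a binomial confidence interval. The following is a minimal sketch, not the analysis used in either study: it computes a Wilson score interval, and the counts (69 of 100, 41 of 50) are hypothetical values chosen only to reproduce accuracies of 69% and 82%. The actual sample sizes are not stated in this section.

```python
import math


def wilson_interval(correct, total, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95% coverage)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = correct / total
    denom = 1 + z ** 2 / total
    center = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return center - half, center + half


def intervals_overlap(a, b):
    """True if two closed intervals (low, high) share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical counts, NOT the sample sizes of either study.
t1_ci = wilson_interval(69, 100)   # T1-weighted, whole brain
dwi_ci = wilson_interval(41, 50)   # DWI-based classification

print("T1 accuracy CI:  %.3f to %.3f" % t1_ci)
print("DWI accuracy CI: %.3f to %.3f" % dwi_ci)
print("Above chance (lower bound > 0.5):", t1_ci[0] > 0.5)
print("Intervals overlap:", intervals_overlap(t1_ci, dwi_ci))
```

With counts of this order, the lower bound for the T1-based accuracy stays above 0.5 while the two intervals overlap, which is the situation described in the text. Different sample sizes would change both conclusions.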
As expected, classification accuracy improved for PSC subjects closest to estimated symptom onset. The best performance was achieved when brain areas used for classification were limited to regions identified by VBM as affected in the PSC group. In general, a multivariate method that includes information from various brain areas should show favorable performance when more voxels (reflecting the volume of more brain regions) yield relatively more signal than noise. The figure illustrates that group separation relies heavily on voxels within the caudate nucleus and particularly its head. Reduced gray matter in the insula and parietal cortex also reflected PSC status, a finding well in line with previous imaging studies.2,24
The figure also displays cortical voxels scattered throughout the brain, without a regionally specific pattern. These scattered voxels constitute a source of “noise,” which explains the superior performance of classification using the caudate alone, a procedure equivalent to minimizing noise.
illustrates the benefit of various levels of a priori information, which becomes most obvious for subjects in the middle group but also when all subjects are combined. In contrast, no meaningful classification accuracy was achieved in subjects far from estimated clinical onset, no matter how much a priori information was used.
VBM-derived prior information from an independent set of images served two purposes. First, we avoided overoptimistic claims and any circular logic about result generalization, which would have arisen had we created VBM-weighted images from the images that were also classified with SVM. Second, VBM analysis created a specific weighted group image that characterized the preclinical HD phase. The creation of similarly informative images could have been achieved using atlas-based masks of putamen and caudate. The approach we present here is more flexible. It allows the creation of disease-specific weighted images when disease distribution does not respect anatomic boundaries or is more widespread. A further advantage of our approach is that each voxel obtains a specific weighting. In contrast, anatomically based masks are normally binary and hence less specific. As expected, no improvement of classification was achieved when VBM-derived T-maps were binarized (data not shown). Relatively labor-intensive manual outlining methods, often used in HD,25 would be less suitable for screening than the one presented here. A study comparing both approaches in early HD26 found that both methods reliably showed expected degeneration, but VBM detected additional changes in brain regions not selected a priori.
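The difference between voxel-specific weighting and a binary mask, as discussed above, can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' pipeline: it assumes the prior is applied by multiplying each gray matter voxel by its T-value (negative values clipped to zero) before a linear kernel is computed for the SVM. The function names, the threshold, and the three-voxel "images" are all hypothetical.

```python
def weight_image(image, tmap):
    """Scale each voxel by its prior weight (assumption: negative T-values are clipped to 0)."""
    if len(image) != len(tmap):
        raise ValueError("image and T-map must have the same number of voxels")
    return [v * max(t, 0.0) for v, t in zip(image, tmap)]


def binarize(tmap, threshold):
    """Turn a graded T-map into a 0/1 mask, discarding the graded information."""
    return [1.0 if t > threshold else 0.0 for t in tmap]


def linear_kernel(a, b):
    """Inner product of two images; this is what a linear SVM sees for each pair of subjects."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical toy data: three voxels, with the first most affected according to the prior.
tmap = [4.0, 2.0, 0.0]
subject_a = [0.5, 0.75, 0.25]
subject_b = [0.25, 0.75, 0.5]

mask = binarize(tmap, threshold=1.0)

graded = linear_kernel(weight_image(subject_a, tmap), weight_image(subject_b, tmap))
binary = linear_kernel(weight_image(subject_a, mask), weight_image(subject_b, mask))

print("kernel with graded weights:", graded)
print("kernel with binary mask:   ", binary)
```

With graded weights, the first voxel contributes most to the similarity between subjects. With the binary mask, the first two voxels are treated as equally informative, so the grading carried by the T-map is lost.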
Performance was at chance level when we attempted to separate the subgroup far from clinical presentation from matched controls. Depending on the individual number of CAG repeats and age, subjects in this group were an estimated 20 years or more from developing signs of disease. It is a matter of debate when striatal degeneration starts. A large-scale study based on striatal volume change in PSC27 illustrates that decline of striatal volume is very subtle in subjects with more than 20 years to estimated onset but becomes substantially steeper around 15 years beforehand. The VBM analysis confirms that structural changes in the group farthest from onset were either absent or too subtle to be detected at the group level. In contrast, bilateral striatal gray matter loss found in the other subgroups confirms previous work using VBM.2
Classification performance was far from perfect. There is a wide range of techniques for extracting image characteristics to feed into various classification methods.3,9,11,28
The purpose of our study was to test, in preclinical HD, a gray matter–based SVM classification that had been successfully applied to patients with mild to moderate AD.7
The study in AD demonstrated the utility of the method at a stage when both clinical signs and disease-related atrophy were significant. Here we use genetic information not only to recruit individuals before the manifestation of any clinical deterioration, but also to estimate years to onset of disease and thus make use of the technique to detect the earliest and most subtle degenerative change in the brain. Both studies used data acquired at multiple imaging centers. Although this has to be shown for each disease, our work suggests that data can be exchanged between centers. If this proves true for other diseases, it would make excessive data acquisitions unnecessary and would facilitate the application to rarer neurodegenerative disorders.
Our results show that fully automatic detection of preclinical degeneration is possible so that identified subjects could become candidates for longitudinal follow-up in clinical trials, possibly many years before clinical presentation.27
Future studies will need to test whether multivariate classification methods such as those presented here can play a part in detecting longitudinal change alongside currently used, well-established imaging, cognitive, and behavioral measures.27