We demonstrate that a combination of cross-sectional and longitudinal FDG-PET information results in classification performance that is in line with the current state of the art. For the most commonly reported classification task of separating AD patients from HC, our accuracy of 88% is comparable with other recent classification results based on multi-modality imaging and non-imaging data (
Hinrichs et al., 2011;
Zhang et al., 2011), and also with the results of high-dimensional pattern recognition methods applied to cross-sectional MR imaging data (
Cuingnet et al., 2011;
Chupin et al., 2009). Classification results may well be converging on a “glass ceiling” for this task, since diagnostic consensus criteria themselves have an accuracy of around 90% (
Ranginwala et al., 2008). For FDG-PET in particular, it is also important to consider the further confounding factor that approximately 10% of the ADNI AD patients have a pattern of glucose metabolism that is more consistent with frontotemporal dementia (
Thiele et al., 2009;
Jagust et al., 2010).
We additionally attempt the less commonly reported classification task of separating pMCI from sMCI patients. Our accuracy of 65% is encouraging compared with the most directly comparable studies based on MR imaging data (
Cuingnet et al., 2011;
Wolz et al., 2010). It has been reported that progression from MCI to AD occurs at a rate of 10–15% per year (
Petersen, 1999), with up to 80% of MCI patients developing AD over a six-year period (
Petersen, 2004). To properly assess the utility of any classification method in predicting such progression, longer clinical follow-up is therefore required than is currently available for the ADNI participants.
To verify that the regional features used for classification made biological sense, we performed t-tests between clinical groups to assess which regions showed statistically significant group differences. A direct visualisation of the SVM weight vector would have been desirable but, because the kernel used was nonlinear, the weight vectors learned in the transformed feature space could not be mapped back to the original feature space in a meaningful way. We therefore explored univariate changes using t-tests for the purposes of visualisation. For the cross-sectional data, regional t-values between AD patients and HC indicated significant differences across most of the brain. This finding is consistent with the voxel-wise t-tests reported in
Yakushev et al. (2009). The most significantly different regions between groups included those known to be affected in AD for all three feature sets, consistent with previous voxel-wise t-tests performed on the ADNI FDG-PET data (
Langbaum et al., 2009;
Chen et al., 2010).
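As a minimal sketch of the univariate regional analysis described above, the following Python computes a two-sample t-statistic for each region and ranks the regions by absolute t-value. The region names, the toy values, and the use of Welch's unequal-variance form are assumptions made for illustration only; the text does not state which t-test variant was used.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch two-sample t-statistic (unequal variances assumed)."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


def rank_regions(group_a, group_b):
    """Return (region, t) pairs sorted by decreasing |t|.

    group_a, group_b: dict mapping region name -> list of per-subject values.
    """
    t_values = {r: welch_t(group_a[r], group_b[r]) for r in group_a}
    return sorted(t_values.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Hypothetical normalised regional FDG-PET values (illustrative only).
hc = {
    "posterior_cingulate": [1.30, 1.28, 1.35, 1.31],
    "cerebellum": [1.00, 1.02, 0.99, 1.01],
}
ad = {
    "posterior_cingulate": [1.10, 1.05, 1.12, 1.08],
    "cerebellum": [1.01, 0.99, 1.00, 1.02],
}

ranking = rank_regions(hc, ad)
```

In this toy example the region with reduced values in the patient group is ranked first. A real analysis would also convert each t-value to a p-value and correct for multiple comparisons across regions.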
Similarly to
Hinrichs et al. (2011), we found that the percentage change in signal intensity over 12 months does not, on its own, provide strong classification performance between AD patients and HC (74% accuracy). Although the longitudinal data alone appear insufficient to match state-of-the-art classification performance, our results demonstrate that they provide complementary information which can enhance classification when used in conjunction with cross-sectional features. This is supported by our t-test results, which show that the pattern of regional significances differs between cross-sectional and longitudinal data. For example, the amygdala is identified amongst the five best features for group discrimination only for the longitudinal data. The two cross-sectional feature sets, on the other hand, have similar patterns of regional significances, although improved group discrimination may be achieved with the 12-month data.
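The longitudinal feature and its combination with the cross-sectional features can be sketched as follows. This is an illustrative sketch only: the function names and toy values are hypothetical, and concatenating all three feature sets (baseline, 12-month, and percentage change) in this order is an assumption about how the direct concatenation might be arranged.

```python
def percent_change(baseline, followup):
    """Per-region percentage change in signal intensity from baseline to follow-up."""
    return [100.0 * (f - b) / b for b, f in zip(baseline, followup)]


def combined_features(baseline, followup):
    """Directly concatenate cross-sectional and longitudinal regional features."""
    return list(baseline) + list(followup) + percent_change(baseline, followup)


# Hypothetical normalised regional intensities for one subject (two regions).
baseline = [1.25, 1.00]
followup = [1.20, 1.01]

change = percent_change(baseline, followup)
features = combined_features(baseline, followup)
```

Because the percentage change is on a different scale from the normalised intensities, feature scaling before classification would normally be needed; that step is omitted here.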
We additionally performed all classification experiments after accounting for the effects of age and gender by linear regression. The lack of significant effect on accuracy observed in the majority of cases indicates that the classification results were truly based on disease-specific imaging information, rather than the intrinsic age and gender information also captured in the images. The significant improvement observed between pMCI and sMCI patients is in agreement with our previous findings (
Gray et al., 2011c,
b). In our previous work, we had performed a global regression, whereby the coefficients for age and gender were estimated using all the available data from all four of the diagnostic groups. We have since demonstrated that there is little appreciable difference between the effect of the two regression approaches.
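A minimal sketch of accounting for age and gender by linear regression is given below, assuming an ordinary least-squares fit of each regional feature on an intercept, age, and a 0/1 gender code. The function names, the coding of gender, and the toy data are hypothetical. Which subjects are used to estimate the coefficients (for example, all four diagnostic groups, as in the global regression mentioned above) is left to the caller.

```python
def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_covariates(values, ages, genders):
    """Least-squares coefficients for value ~ intercept + age + gender."""
    design = [[1.0, a, g] for a, g in zip(ages, genders)]
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(3)] for i in range(3)]
    xty = [sum(row[i] * v for row, v in zip(design, values)) for i in range(3)]
    return solve(xtx, xty)


def remove_covariates(values, ages, genders, coef):
    """Subtract the fitted age and gender effects, keeping the intercept."""
    return [v - coef[1] * a - coef[2] * g for v, a, g in zip(values, ages, genders)]


# Hypothetical data: one regional feature that depends only on age and gender.
ages = [60.0, 70.0, 80.0, 65.0, 75.0]
genders = [0.0, 1.0, 0.0, 1.0, 0.0]
values = [2.0 - 0.01 * a + 0.05 * g for a, g in zip(ages, genders)]

coef = fit_covariates(values, ages, genders)
adjusted = remove_covariates(values, ages, genders, coef)
```

In this constructed example the feature is fully explained by the covariates, so the adjusted values collapse to the intercept. With real data, the residual variation is what remains available to the classifier.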
The aim of this work has not been to introduce a novel classification approach, but instead to use a readily available SVM classifier and simple feature combination approach (direct concatenation) to demonstrate the utility of longitudinal FDG-PET information for improving classification amongst four clinically relevant pairs of diagnostic groups. Having established that the longitudinal features can indeed enhance the results achieved using cross-sectional data alone, it may be beneficial to investigate the application of kernel combination methods, which are reported to be superior to simple concatenation for combining feature sets (
Zhang et al., 2011). Additionally, the possibility of multi-class classification could be investigated, for example, using the LIBSVM “one-against-one” multi-class strategy (
Chang and Lin, 2011).
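The one-against-one strategy trains one binary classifier for every pair of classes and assigns a new sample to the class that wins the most pairwise votes. The sketch below illustrates only this voting scheme. It does not call LIBSVM: a nearest-centroid rule stands in for each binary SVM, and the feature values and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def centroid(rows):
    """Mean feature vector of a list of samples."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(u, v):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def train_pairwise(samples):
    """Build one binary model per pair of class labels.

    samples: dict mapping label -> list of feature vectors.
    Each 'model' here is just the pair of class centroids (a stand-in for an SVM).
    """
    return {
        (a, b): (centroid(samples[a]), centroid(samples[b]))
        for a, b in combinations(sorted(samples), 2)
    }


def predict(models, x):
    """Each pairwise model casts one vote; the most-voted label is returned."""
    votes = Counter()
    for (a, b), (ca, cb) in models.items():
        votes[a if sq_dist(x, ca) <= sq_dist(x, cb) else b] += 1
    return votes.most_common(1)[0][0]


# Hypothetical two-feature samples for the four diagnostic groups.
samples = {
    "HC": [[1.30, -0.5], [1.32, -0.3]],
    "sMCI": [[1.25, -1.0], [1.23, -1.2]],
    "pMCI": [[1.15, -2.5], [1.17, -2.7]],
    "AD": [[1.05, -4.0], [1.07, -4.2]],
}

models = train_pairwise(samples)
label = predict(models, [1.06, -4.0])
```

With four diagnostic groups this yields six pairwise classifiers, which are the same pairings that arise when the binary tasks are considered separately.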
An important consideration of the described regional FDG-PET analysis approach is its requirement for MR imaging data. Structural imaging, either with MRI or CT, is routinely used in clinical practice to identify brain lesions that could lead to a clinical picture mimicking a diagnosis of AD. Both MRI and FDG-PET are mentioned in the revised AD diagnostic criteria (
McKhann et al., 2011;
Albert et al., 2011;
Sperling et al., 2011) as providing potentially useful biomarkers, and the recent development of hybrid MRI-PET technology means that the simultaneous acquisition of both modalities could become a practical solution for dementia imaging in the future. For example, one such system has been approved for use in clinical practice in both Europe and the USA, and its clinical application in oncology has already been demonstrated (
Drzezga et al., 2011). The requirement for MR data has the key benefit that regional volumes and volume changes are also available for each patient, and these data could potentially be combined with the FDG-PET information.
Two methodological image processing issues merit discussion. The first concerns our decision to nonrigidly propagate the baseline segmentations to follow-up space, rather than, for example, using the MAPER segmentation procedure to automatically generate independent follow-up segmentations. Although this means that erroneously labelled voxels in the baseline segmentation are propagated to the follow-up image, intra-subject consistency of the segmentation is important for measuring longitudinal change (
Crum et al., 2001), since uncorrelated errors lead to greater measurement uncertainty. The second is the issue of FDG-PET image normalisation. The reference cluster normalisation method (
Yakushev et al., 2009) was proposed as a data-driven method, with the cluster derived directly from the image data. However, we used an independently derived cluster for normalisation to avoid introducing bias into the classification process. It was important to first assess the validity of this approach, by determining whether the regions identified as relatively preserved in
Yakushev et al. (2009) are also valid for the ADNI dataset. We therefore derived a reference cluster using the ADNI FDG-PET images, and calculated the intraclass correlation coefficient (ICC) between the values obtained by sampling this ADNI derived cluster and those obtained by sampling the independently derived cluster. The resulting ICC of 0.95 suggests that the area of the brain identified is reliably preserved across early AD and MCI, and thus is likely to provide a robust and portable reference region for image normalisation.
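The agreement check described above can be sketched as follows. The form of the ICC is not specified in the text; this sketch assumes the two-way random-effects, absolute-agreement, single-measure form, often written ICC(2,1), and uses toy values rather than ADNI data. Each row holds the two reference values obtained for one subject, one per cluster.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    ratings: list of rows, one per subject, each holding k measurements
    (here k = 2: one value per reference cluster).
    """
    n = len(ratings)
    k = len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for subjects (rows), measurements (columns), and error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Toy examples: perfect agreement, and a constant offset between the two clusters.
identical = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
offset = [[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]

icc_identical = icc_2_1(identical)
icc_offset = icc_2_1(offset)
```

Under the absolute-agreement form, a systematic offset between the two clusters lowers the coefficient even when the subject ranking is preserved. A consistency form of the ICC would not penalise such an offset, so the choice of form affects how a value such as 0.95 should be read.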
This study demonstrates that information extracted from serial FDG-PET through regional analysis can accurately discriminate diagnostic groups, even at the early symptomatic stages of the disease. This finding may be usefully applied in diagnosing Alzheimer’s disease, in predicting disease course in individuals with mild cognitive impairment, and in selecting participants for clinical trials. Importantly, we demonstrate the utility of serial regional FDG-PET for patient classification in a realistic multi-centre setting. Although the use of longitudinal data for the clinical diagnosis of AD is not necessarily practical, its use for stratification of pMCI versus sMCI patients could still be valuable. For clinical trial recruitment, in particular, it may well be acceptable to use longitudinal information acquired over 12 months to gain additional certainty about whether a candidate fits the selection criteria.
We have identified several areas for further research. We have already begun to explore some of the possibilities, such as using a more sophisticated method for data combination, and making use of both MRI and PET in combination (
Gray et al., 2011a). In the future, we intend to additionally investigate the incorporation of non-imaging data, such as CSF biomarkers or genetic information. Machine learning techniques using cross-sectional FDG-PET data have been successful in discriminating AD patients from those with frontotemporal dementia (for example,
Kippenhan et al. (1994);
Xia et al. (2008)), and we would be interested to investigate the effect of incorporating longitudinal information on such differential diagnoses. While it is possible that the ADNI dataset contains some patients with other dementias, such as frontotemporal dementia or dementia with Lewy bodies, these patients are not clinically labelled as such. To perform a thorough study on differential diagnosis, a large and varied cohort of dementia patients with autopsy-confirmed clinical diagnoses would be required, such as that described in
Silverman et al. (2001).
Research Highlights
- cross-sectional and longitudinal FDG-PET for classification of Alzheimer’s disease
- multi-region FDG-PET analysis using automatically generated whole-brain segmentations
- combining cross-sectional and longitudinal features improves classifier performance
- state-of-the-art classification of diagnostic groups using serial FDG-PET