We have described a method of brain atrophy measurement from serial MR imaging that addresses the problem of differences in tissue contrast and SNR over time and between scanners. The method involves tissue-specific intensity normalization to improve consistency over time, and automated BSI parameter selection based on image-specific brain boundary contrast to improve consistency between scanners. The method was applied to over 300 baseline and 1-year volumetric MR image pairs acquired in a large multi-site imaging study of controls and AD subjects (ADNI). The new method, KN-BSI, reduced the number of subjects required in a hypothetical multi-site clinical trial of drug treatment in AD by an estimated 32% (95% CI 18% to 45%), compared to classic-BSI. Confidence intervals are often not reported for estimates of sample sizes or their ratios, which limits the interpretation and comparison of estimates between studies.
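To make the link between measurement variability and sample size concrete, the following is a minimal sketch of a conventional two-arm sample size calculation. It assumes a normal approximation, equal group sizes and a treatment effect expressed as a fractional reduction in mean atrophy rate. The function names and the numerical inputs are hypothetical illustrations, not values or code from this study.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(mean_rate, sd_rate, effect=0.25, alpha=0.05, power=0.80):
    """Subjects per arm needed to detect a fractional reduction `effect`
    in the mean atrophy rate (two-sided test, normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    delta = effect * mean_rate  # absolute difference in rate to detect
    return ceil(2 * (z_alpha + z_beta) ** 2 * sd_rate ** 2 / delta ** 2)


def relative_reduction(n_reference, n_new):
    """Fractional reduction in sample size of a new method versus a reference."""
    return 1 - n_new / n_reference


# Hypothetical atrophy rates (%/year): same mean, lower SD for the new method.
n_reference = sample_size_per_arm(mean_rate=1.5, sd_rate=1.0)
n_new = sample_size_per_arm(mean_rate=1.5, sd_rate=0.8)
print(n_reference, n_new, round(relative_reduction(n_reference, n_new), 2))
```

Because the required sample size scales with the square of the SD-to-mean ratio, even a modest reduction in measurement variability produces a disproportionately large reduction in the number of subjects.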
shows that KN-BSI is more robust than classic-BSI to the artifacts in images with poorer or less consistent image quality (score 4). The mean atrophy rates measured with KN-BSI were very similar across all quality scores. However, this was not true for classic-BSI: the atrophy rates from the poorer quality image pairs (score 4) differed from, and importantly were more variable (higher SDs) than, those from the better quality image pairs.
The quality scores inevitably involve some arbitrary judgments, which may be influenced by multiple factors: not only changes in tissue contrast, which KN-BSI seeks to address, but also movement and other artifacts. The ADNI MR dataset is unique in terms of the effort that went into protocol development and the ongoing quality control process to try to ensure that images were consistent across sites and over time (Jack et al., 2008). Despite these efforts, there were significant changes over time, which are inevitable in multi-site studies that last several years. The more variable the tissue contrast between sites and over time, the more important it will be for techniques such as KN-BSI to try to minimize these confounds. Clearly, future studies will need to prioritize the stability of MR acquisition over time.
Our intensity normalization method is closely related to the work of Nyúl and Udupa (1999), which suggested that intensities in MR images could be mapped to a standardized range by using the modes of the histogram. In this work, we used k-means clustering to automatically find these modes, which correspond to CSF, GM and WM, although our method is not dependent on a specific classification technique.
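The idea of using tissue modes as landmarks can be illustrated with a minimal sketch. It assumes that brain voxel intensities are available as a flat list, that three clusters correspond to CSF, GM and WM, and that the repeat scan is mapped piecewise linearly onto the baseline landmarks. The helper names `kmeans_1d` and `piecewise_linear_map`, the toy intensities and the choice of mapping are assumptions made for illustration; they are not necessarily the exact normalization used in KN-BSI.

```python
def kmeans_1d(values, k=3, iterations=50):
    """Lloyd's algorithm on scalar intensities; returns sorted cluster means."""
    ordered = sorted(values)
    # Initialise the centres at evenly spaced quantiles of the data.
    centres = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centres[i]))
            groups[nearest].append(v)
        # Keep the old centre if a cluster happens to be empty.
        updated = [sum(g) / len(g) if g else centres[i] for i, g in enumerate(groups)]
        if updated == centres:
            break
        centres = updated
    return sorted(centres)


def piecewise_linear_map(value, landmarks, targets):
    """Map `value` so that each landmark lands on its target intensity.
    Values outside the landmark range are extrapolated from the end segments.
    Landmarks are assumed to be distinct and sorted."""
    last = len(landmarks) - 2
    for i in range(len(landmarks) - 1):
        if value <= landmarks[i + 1] or i == last:
            x0, x1 = landmarks[i], landmarks[i + 1]
            y0, y1 = targets[i], targets[i + 1]
            return y0 + (value - x0) * (y1 - y0) / (x1 - x0)


# Toy intensities: the repeat scan has different tissue contrast.
baseline = [18, 20, 22, 58, 60, 62, 98, 100, 102]
repeat = [27, 30, 33, 72, 75, 78, 126, 130, 134]

baseline_modes = kmeans_1d(baseline)  # CSF, GM, WM means at baseline
repeat_modes = kmeans_1d(repeat)      # CSF, GM, WM means at repeat
normalized = [piecewise_linear_map(v, repeat_modes, baseline_modes) for v in repeat]
print(baseline_modes, repeat_modes, normalized)
```

Any classifier that yields reliable tissue means could replace the clustering step, which is consistent with the method not depending on a specific classification technique.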
Tissue intensity changes may be caused by underlying neuropathology in neurodegenerative diseases. Pathological changes in hydration state, cell content (e.g. neuron loss, gliosis) and chemistry alter the tissues at a cellular level, which will be reflected in changes in tissue intensity in MR images over the long term. This effect, however, is much smaller than the changes that can be introduced by scanning equipment. The changes due to upgrades of scanners or differences in patient positioning may be an order of magnitude greater than those due to the underlying neuropathology over 1 year in neurodegenerative diseases. In particular, in Alzheimer’s disease, studies looking at T1 values (Ramani et al., 2006) found that the difference between AD patients and controls is relatively small, but nonetheless should not be ignored. Volume changes (atrophy) will be over and above these effects, and the BSI specifically measures changes at the boundary between brain and CSF, and as such will be less sensitive to changes in intrinsic tissue signal intensity.
We showed that the intensity window in BSI can be automatically and objectively chosen, based on the mean and standard deviation of signal intensity in different tissue classes. Although the automatically chosen intensity windows were similar to the manually chosen intensity window, the figure shows that the intensity window depended on the make of the scanner and that one intensity window was not necessarily appropriate for all image pairs. The automatic method therefore has the advantages of being reproducible, conceptually simple, easy to implement and not directly reliant on the semi-automatically segmented brain regions or the judgment of the image analysts. It should be noted that although the automatic intensity window is chosen to capture tissue type change between CSF and GM, it will also capture the tissue type change between CSF and WM, because the intensity of WM is greater than that of GM in T1-weighted images.
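As an illustration of how a window could be derived from tissue statistics and then used, the sketch below sets the lower bound one standard deviation above the CSF mean and the upper bound one standard deviation below the GM mean, then evaluates a boundary shift integral over a set of boundary voxels. The specific window rule, the function names and the toy intensities are assumptions for illustration; the sketch also assumes that registration and boundary region definition have already been done.

```python
from statistics import mean, stdev


def automatic_window(csf_values, gm_values):
    """Intensity window from tissue statistics (assumed rule:
    CSF mean + 1 SD as lower bound, GM mean - 1 SD as upper bound)."""
    lower = mean(csf_values) + stdev(csf_values)
    upper = mean(gm_values) - stdev(gm_values)
    if lower >= upper:
        raise ValueError("CSF and GM intensities overlap too much for a window")
    return lower, upper


def clip(value, lower, upper):
    """Restrict an intensity to the window [lower, upper]."""
    return max(lower, min(upper, value))


def boundary_shift_integral(baseline, repeat, lower, upper, voxel_volume=1.0):
    """Sum of clipped intensity differences over boundary voxels,
    normalized by the window width and scaled to a volume.
    Positive values indicate brain tissue replaced by CSF (atrophy)."""
    total = sum(
        clip(b, lower, upper) - clip(r, lower, upper)
        for b, r in zip(baseline, repeat)
    )
    return voxel_volume * total / (upper - lower)


# Toy example: one of three boundary voxels changes from GM to CSF intensity.
lower, upper = automatic_window(csf_values=[18, 20, 22], gm_values=[56, 60, 64])
bsi = boundary_shift_integral([60, 60, 20], [20, 60, 20], lower, upper)
print(lower, upper, bsi)
```

Because intensities are clipped to the window, a voxel that changes from WM to CSF contributes the same amount as one that changes from GM to CSF, which matches the observation that the window also captures CSF/WM transitions.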
Reductions of up to 30% in sample size requirements would have very material and significant benefits. Clinical trials seeking to show effects on disease progression in AD (or other neurodegenerative conditions) are large, lengthy and expensive. The reduced sample size requirements may mean that trials can be better powered and/or more cost-effective, allowing more treatments to be tested and fewer patients to be exposed to possible side effects. The measurement of brain atrophy rates is relevant for a number of different diseases beyond AD. KN-BSI can provide more robust and less variable brain atrophy measurement in other diseases, such as frontotemporal dementia (Chan et al., 2001), multiple sclerosis (Anderson et al., 2007) and Huntington’s disease (Henley et al., 2006). The issues related to the importance (and cost) of multi-site studies in these disorders are very similar to those encountered in trials in AD.
This study highlights the potential problems of scan acquisition changes over time. These problems may be due to operator error or scanner hardware and software changes; these are inevitable in large and lengthy multi-site clinical studies and may be very obvious or quite subtle but are nonetheless important. Furthermore, these problems extend beyond BSI and would increase the variability of the results of other image analysis algorithms or manual measurement that depend on the tissue contrast in the images. One of the strengths of this study lies in the comparison of classic-BSI and KN-BSI using a large number of images (682 images from 341 subjects) acquired on at least seven different models of scanners at multiple sites.
Interestingly, the differences between KN-BSI and classic-BSI varied widely between sites, with large differences implying that a site had less consistent image contrast over time. The most extreme cases were images acquired at site X, which had a hardware change. This suggests that differences between the two methods may contribute to the monitoring of scanners for quality control purposes. More simply, within-subject changes in GM/WM/CSF contrast, estimated using methods such as k-means clustering, could be used to help assess scanner stability or detect hardware, software or parameter changes.
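A simple contrast-based stability check of this kind might look like the following sketch. It assumes that CSF, GM and WM mean intensities have already been estimated for each time point (for example by clustering), and it uses intensity ratios so that a global rescaling of the image does not register as a contrast change. The function names and the 5% tolerance are hypothetical and would need to be calibrated against real data.

```python
def tissue_contrast(csf_mean, gm_mean, wm_mean):
    """Ratios between neighbouring tissue classes; unaffected by global scaling."""
    return {"gm_csf": gm_mean / csf_mean, "wm_gm": wm_mean / gm_mean}


def contrast_change(baseline_means, repeat_means):
    """Fractional change in each contrast ratio between two time points.
    Each argument is a (CSF, GM, WM) tuple of mean intensities."""
    base = tissue_contrast(*baseline_means)
    rep = tissue_contrast(*repeat_means)
    return {name: rep[name] / base[name] - 1 for name in base}


def flag_unstable(changes, tolerance=0.05):
    """True if any contrast ratio changed by more than `tolerance` (hypothetical)."""
    return any(abs(change) > tolerance for change in changes.values())


# Toy tissue means (CSF, GM, WM) for one subject at two time points.
rescaled_only = contrast_change((20, 60, 100), (40, 120, 200))
contrast_shift = contrast_change((20, 60, 100), (30, 75, 130))
print(flag_unstable(rescaled_only), flag_unstable(contrast_shift))
```

Aggregating such within-subject changes by site and acquisition date is one way that abrupt shifts after a hardware or software change could be made visible.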
In conclusion, we have demonstrated that the robustness and variability of atrophy rate measurement for large multi-site imaging studies can be improved using the KN-BSI method described in this paper. Given the increasing use of MRI outcomes in large multi-site trials, methods that can reduce the variability of these outcomes due to tissue contrast and SNR changes over time and between scanners will be increasingly valuable. However, they are not a substitute for rigorous quality control and assurance of scanners, or for attention to detail in acquiring images.