Voxel-Based Analysis and Statistical Parametric Mapping (VBA-SPM) [1], [18] of imaging data have offered the potential to analyze structural and functional data in great spatial detail, without the need to define a priori regions of interest (ROIs). As a result, numerous studies [7], [11], [23], [33], [48], [49], [50] during the past decade have better investigated brain structure and function in normal and diseased populations, and have enabled the quantification of spatio-temporal imaging patterns.

A fundamentally important aspect of VBA-SPM has been the spatial smoothing of images prior to analysis. Typically, Gaussian blurring with a full width at half maximum (FWHM) in the range of 8–16 mm is used to account for registration errors, to Gaussianize the data, and to integrate imaging signals from a region rather than from a single voxel. The effect of this smoothing function is critical. If the kernel is too small for the task, statistical power is lost, and the resulting false negatives cause many regions that might present group differences in structure and function to be missed. If the kernel is too large, statistical power can also be lost, because image measurements from regions that display group differences are blurred with measurements from regions that display none. In the latter case, spatial localization is also seriously compromised, as heavy smoothing blurs the measurements out and often leads to false conclusions about the origin of a functional activation or of brain atrophy. Moreover, a filter that is too large, or that is not matched to the underlying group difference, will also have reduced sensitivity in detecting group differences. As a result, the Gaussian kernel is often chosen empirically or in an ad hoc fashion. This is an obvious limitation of such VBA-SPM analyses, in part because of its heuristic nature, and in part because it can lead to overfitting of the data without proper cross-validation or correction for multiple comparisons.
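To make the kernel sizes quoted above concrete, the following minimal sketch converts an FWHM in millimetres to the standard deviation of the corresponding Gaussian and samples a normalized one-dimensional kernel. The function names and the 1 mm voxel size are assumptions made for this illustration only; they do not come from any particular SPM implementation.

```python
import math


def fwhm_to_sigma(fwhm_mm):
    """Standard deviation (mm) of a Gaussian with the given FWHM (mm)."""
    # A Gaussian falls to half its peak at x = sigma * sqrt(2 ln 2),
    # so FWHM = 2 * sqrt(2 ln 2) * sigma (about 2.355 * sigma).
    return fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def gaussian_kernel_1d(fwhm_mm, voxel_mm, radius_sigmas=3.0):
    """Normalized 1-D Gaussian kernel sampled on the voxel grid.

    A 3-D isotropic Gaussian is separable, so the same 1-D kernel can be
    applied along each image axis in turn.
    """
    sigma_vox = fwhm_to_sigma(fwhm_mm) / voxel_mm  # sigma in voxel units
    radius = max(1, math.ceil(radius_sigmas * sigma_vox))
    weights = [math.exp(-0.5 * (i / sigma_vox) ** 2)
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical setting: 8 mm FWHM on a 1 mm isotropic grid.
sigma_mm = fwhm_to_sigma(8.0)
kernel = gaussian_kernel_1d(fwhm_mm=8.0, voxel_mm=1.0)
print(f"sigma = {sigma_mm:.3f} mm, kernel length = {len(kernel)} voxels")
```

Under these assumptions, a kernel at the lower end of the usual range already spans a couple of dozen 1 mm voxels, which illustrates how readily measurements from neighboring structures are mixed.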

However, the most profound limitation of Gaussian smoothing of images prior to applying the General Linear Model (GLM) [18] is its lack of spatial adaptivity to the shape and spatial extent of a structural or functional region of interest. For example, if atrophy or functional activation in the hippocampus is to be detected, Gaussian smoothing will blur volumetric or activation measurements from the hippocampus with such measurements from surrounding tissues, including the ventricles, the fusiform gyrus, the amygdala and the white matter. Previous work in the literature [13] has shown that spatially adaptive filtering of image data can dramatically improve statistical power to detect group differences. However, little is known about how to optimally define the shape and extent of the smoothing filter, so as to maximize the ability of VBA-SPM to detect group effects.

In this paper, we present a mathematically rigorous framework for determining the optimal spatial smoothing of structural (and potentially functional) images, prior to applying voxel-based group analysis. We consider this problem in the context of determining group differences, and we therefore restrict our experiments to voxel-wise statistical hypothesis testing. However, the mathematical formalism and algorithm are generally applicable to any type of VBA. In order to determine the optimal smoothing kernel, a regional discriminative analysis, restricted by appropriate nonnegativity constraints, is applied to a spatial neighborhood around each voxel, aiming to find the direction (in a space of dimensionality equal to the size of the neighborhood) that best highlights the difference between two groups in that neighborhood. Since each voxel belongs to a large number of such neighborhoods, each centered on one of its neighboring voxels, the group difference at each voxel is determined by a composition of all these optimal smoothing directions. Permutation tests are used to obtain the statistical significance of the resulting ODVBA maps.
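The overall idea of a nonnegative, locally discriminative weighting followed by a permutation test can be sketched for a single neighborhood as follows. This is only an illustrative stand-in, not the optimization developed in this paper: the weights here are a simple per-voxel mean difference divided by the pooled variance and clipped at zero, the composition over overlapping neighborhoods is omitted, and all function names and the simulated data are hypothetical.

```python
import random
from statistics import mean, variance


def nonneg_direction(group_a, group_b):
    """Stand-in nonnegative weights for one neighborhood (tests A > B).

    Each subject is a list of voxel values from the same neighborhood.
    NOTE: an assumed simplification, not the paper's constrained solver.
    """
    k = len(group_a[0])
    raw = []
    for j in range(k):
        a = [s[j] for s in group_a]
        b = [s[j] for s in group_b]
        pooled_var = 0.5 * (variance(a) + variance(b)) + 1e-12
        raw.append(max(0.0, (mean(a) - mean(b)) / pooled_var))
    total = sum(raw)
    # Fall back to uniform averaging if no voxel favors group A.
    return [r / total for r in raw] if total > 0 else [1.0 / k] * k


def projected_difference(group_a, group_b, w):
    """Difference of group means after projecting each subject onto w."""
    def project(subject):
        return sum(wj * xj for wj, xj in zip(w, subject))
    return (mean([project(s) for s in group_a])
            - mean([project(s) for s in group_b]))


def permutation_p_value(group_a, group_b, n_perm=199, seed=0):
    """One-sided permutation p value; weights are re-learned per shuffle."""
    rng = random.Random(seed)
    w_obs = nonneg_direction(group_a, group_b)
    observed = projected_difference(group_a, group_b, w_obs)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabeling of subjects
        perm_a, perm_b = pooled[:n_a], pooled[n_a:]
        w_perm = nonneg_direction(perm_a, perm_b)
        if projected_difference(perm_a, perm_b, w_perm) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)


# Simulated neighborhood of 9 voxels; only the first 3 carry a group effect.
rng = random.Random(1)
group_b = [[rng.gauss(0.0, 1.0) for _ in range(9)] for _ in range(20)]
group_a = [[rng.gauss(2.0 if j < 3 else 0.0, 1.0) for j in range(9)]
           for _ in range(20)]
w = nonneg_direction(group_a, group_b)
p_value = permutation_p_value(group_a, group_b)
print([round(x, 2) for x in w], p_value)
```

Re-estimating the weights inside every permutation matters in this sketch: because the weights are learned from the group labels, reusing the observed weights on shuffled data would bias the null distribution toward small values.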

This approach is akin to some fundamental principles of signal and image processing, and more specifically to matched filtering, in which optimal detection of a signal in the presence of noise is achieved by a filter whose kernel is related to the signal itself. In the context of voxel-based statistical analysis, the “signal” is not known, as it relates to the underlying (unknown) group difference. Therefore, the purpose of our optimization is to find the kernel that maximizes the signal detection.
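As a brief illustration of the matched-filter principle, consider the textbook case of a known signal observed in additive white noise. This is a simplified setting rather than the formulation developed in this paper, and the symbols $\mathbf{s}$ (signal), $\mathbf{n}$ (noise), $\mathbf{w}$ (filter weights) and $\sigma^2$ (noise variance) are introduced here only for the illustration.

```latex
\begin{align*}
\mathbf{x} &= \mathbf{s} + \mathbf{n}, \qquad
\mathbb{E}[\mathbf{n}] = \mathbf{0}, \qquad
\mathbb{E}[\mathbf{n}\mathbf{n}^{T}] = \sigma^{2}\mathbf{I}, \\
\mathrm{SNR}(\mathbf{w})
  &= \frac{(\mathbf{w}^{T}\mathbf{s})^{2}}{\mathbb{E}\big[(\mathbf{w}^{T}\mathbf{n})^{2}\big]}
   = \frac{(\mathbf{w}^{T}\mathbf{s})^{2}}{\sigma^{2}\,\mathbf{w}^{T}\mathbf{w}}
   \;\le\; \frac{\|\mathbf{s}\|^{2}}{\sigma^{2}} .
\end{align*}
```

The bound follows from the Cauchy-Schwarz inequality and is attained only when $\mathbf{w} \propto \mathbf{s}$, that is, when the filter has the shape of the signal. Any other kernel, including a Gaussian whose shape differs from $\mathbf{s}$, yields a lower signal-to-noise ratio in this idealized setting.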

ODVBA has some similarities with the searchlight approach [29], [30]; however, it is significantly different. The searchlight method is basically a local multivariate analysis constrained to the immediate neighborhood of a voxel, whereas ODVBA performs a high-dimensional discriminative analysis, using a machine learning technique, over large neighborhoods. This captures anatomical and functional patterns of larger range and thereby determines the optimally discriminative, spatially varying filter. ODVBA also relates to the extensive literature using linear discriminant analysis (LDA) on multivariate patterns of whole brain images. However, implementing standard LDA directly on the images usually suffers from the singularity problem [14], because the number of images is much smaller than the number of brain voxels. To overcome this problem, Kustra and Strother [31] used a smoothness-constrained, penalized LDA as a tool not only for a strict classification task on positron emission tomography (PET) images, but also for extracting the activation patterns. Thomaz et al. [45], [46] presented a principal component analysis (PCA) plus maximum uncertainty LDA that solves the small sample size problem for classification and visual analysis of structural MRIs. Carlson et al. [6] used PCA plus LDA to classify brain activities of different stimulus categories [20], [40] and to find which voxels contribute to the activity. Unfortunately, although the singularity problem has been addressed in the above LDA-based methods, a great deal of important information in the images would be lost, since they employ the smoothness constraint or PCA to reduce the dimensionality prior to implementing LDA. In contrast, ODVBA does not have the singularity problem, since it never involves a matrix inverse computation; moreover, the discriminative analysis is conducted on a data set constructed from the local neighborhood, so it naturally avoids the curse of dimensionality. More importantly, all three of these methods attempt to obtain discriminants or a “canonical image”, which is a spatial distribution of voxels that maximally differentiates between different experimental conditions, for interpretation of abnormality or activation in the groups. However, the resulting discriminants are then usually analyzed by visual inspection or simple thresholding, without determining a voxel-wise statistical value; therefore, they do not produce *p* values for each voxel in the style of traditional SPM, whereas ODVBA does. Finally, ODVBA also employs a non-negativity constraint, which is important as it prohibits canonical images with positive and negative value cancelations, which are often difficult to interpret, especially if a brain region is involved in many canonical images with different weights.
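The singularity problem of standard LDA mentioned above can be made concrete with a short rank argument. The notation is introduced here only for illustration and may differ from the notation used in the rest of the paper: $N$ images $\mathbf{x}_i \in \mathbb{R}^{d}$ with $d$ voxels each, $C$ groups with index sets $\mathcal{G}_c$ and group means $\mathbf{m}_c$, within-class scatter $\mathbf{S}_w$, and between-class scatter $\mathbf{S}_b$.

```latex
\begin{align*}
\mathbf{S}_w &= \sum_{c=1}^{C} \sum_{i \in \mathcal{G}_c}
  (\mathbf{x}_i - \mathbf{m}_c)(\mathbf{x}_i - \mathbf{m}_c)^{T}
  \in \mathbb{R}^{d \times d},
\qquad
\operatorname{rank}(\mathbf{S}_w) \le N - C, \\
\mathbf{w}^{\star} &= \arg\max_{\mathbf{w}}
  \frac{\mathbf{w}^{T}\mathbf{S}_b\,\mathbf{w}}{\mathbf{w}^{T}\mathbf{S}_w\,\mathbf{w}} .
\end{align*}
```

The classical solution of this Fisher criterion is obtained from the eigenvectors of $\mathbf{S}_w^{-1}\mathbf{S}_b$. With a whole-brain image, $d$ is far larger than $N$, so $\operatorname{rank}(\mathbf{S}_w) \le N - C < d$ and $\mathbf{S}_w$ cannot be inverted. This is why the methods cited above first reduce the dimensionality or add a penalty term.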

The rest of the paper is organized as follows. Section II describes the general formulation and its numerical optimization solution. Section III introduces a computationally efficient implementation of ODVBA. Section IV presents a number of experiments with 1) simulated data with known ground truth and 2) structural images of elderly individuals with Alzheimer’s disease (AD). These experiments demonstrate that the proposed methodology significantly improves both the statistical power in detecting group differences and the accuracy with which the spatial extent of the region of interest is determined by VBA-SPM analysis. Section V contains the discussion and conclusion. For convenience, Table I lists important notations used in the paper.

| **TABLE I** Important Notations Used In The Paper. |