To assess the performance of Quarc on anatomical images vis-à-vis standard methodologies, we analyzed disease-specific effect sizes, using Cohen’s
d, on publicly available data sets from research groups funded by ADNI (
Jack et al., 2010), downloaded from
www.loni.ucla.edu/ADNI/Data through 09/22/2010. Along with Quarc, these data sets comprise measures derived from longitudinal structural MRI processed with: (1) standard FreeSurfer v4.3 (
Dale et al., 1999; Fischl et al., 1999, 2002;
Desikan et al., 2006;
Tosun et al., 2010); (2) Boundary Shift Integral (BSI) (
Freeborough and Fox, 1997;
Leung et al., 2010); (3) Tensor Based Morphometry (TBM) (Hua et al., 2008a,b, 2009, 2010); and (4) Voxel Based Morphometry (VBM) (
Ashburner and Friston, 2000;
Tzourio-Mazoyer et al., 2002;
Alexander and Chen, 2010). The measures in these data sets are for various ROIs, both pre-defined tissue regions and data-driven regions, at baseline and followup (generally 6 months apart, as described in the Methods section).
Pairwise head-to-head comparisons with Quarc were performed for each of the four other methodologies, using only measures for baseline and 12-month followup. Publicly available FreeSurfer Longitudinal (v4.4) baseline and 12-month data for any subject implicitly involve the subject’s
full available data set, up to 3 years, while baseline and 12-month data for all other methodologies involve those two time-points only; with this caveat, results for FreeSurfer Longitudinal are reported in SI. Explicit quality control (QC) information was provided for FreeSurfer, BSI, and Quarc data, and was used to filter out subject-visits that did not meet the following criteria: FreeSurfer OVERALLQC=“Pass” or “Partial”; BSI KMNREGRATING≤3; and Quarc QCPASS=1. For TBM and VBM, QC was implicit in that only subject-visits that passed QC were publicly available. The total numbers of remaining subjects in common with Quarc for FreeSurfer, BSI, TBM, and VBM, individually, are shown in . For a given methodology and ROI, the disease-specific effect size was defined as
d = (μAD − μHC) / σAD,
also known as Glass’s Δ, where
μAD is the average annual change in the AD cohort,
μHC is the average annual change in the healthy control (HC) cohort, and
σAD is the standard deviation of the annual change in the AD cohort.
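As an illustration of the QC filtering and the effect-size definition above, the following minimal Python sketch may be helpful. It is not the code used in this study. The record layout (a dictionary with a "method" key) and the function names are hypothetical; only the QC field names and thresholds are taken from the text. The use of the sample (n−1) standard deviation is likewise an assumption.

```python
from statistics import mean, stdev


def passes_qc(record):
    """Return True if a subject-visit record meets the QC criterion for its method.

    `record` is a hypothetical dict, e.g. {"method": "BSI", "KMNREGRATING": 2}.
    """
    method = record["method"]
    if method == "FreeSurfer":
        return record.get("OVERALLQC") in ("Pass", "Partial")
    if method == "BSI":
        rating = record.get("KMNREGRATING")
        return rating is not None and rating <= 3
    if method == "Quarc":
        return record.get("QCPASS") == 1
    # TBM and VBM: QC is implicit, since only passing subject-visits were released.
    return True


def effect_size(ad_changes, hc_changes):
    """d = (mean AD change - mean HC change) / SD of AD change.

    Assumption: the sample (n-1) standard deviation is used for the AD cohort.
    """
    return (mean(ad_changes) - mean(hc_changes)) / stdev(ad_changes)


# Made-up annual percent changes, for illustration only (not ADNI data).
example_d = effect_size([-2.0, -3.0, -4.0], [-0.5, -1.0, -1.5])
```

With atrophy expressed as negative annual change, d is negative, and a larger magnitude indicates stronger separation of the AD cohort from the HC cohort relative to the AD variability.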
A posteriori distributions for the AD and HC means can be built from sampling Student’s
t distributions, and an
a posteriori distribution for the AD standard deviation can be built from sampling a chi-square distribution for the variance (
Rosner, 2006). An
a posteriori distribution for
d can then be built from the ratio of the
a posteriori distribution for the difference in AD and HC means, and the
a posteriori distribution for the standard deviation. The 95% confidence interval on
d can then be calculated from the cumulative distribution for
d. Effect sizes from Quarc with 95% confidence intervals for a global measure (whole brain), a cortical measure (entorhinal), and a subcortical measure (hippocampus) that are important biomarkers for AD (
Holland et al., 2009) are shown in ; numerical values are in , and results for FreeSurfer Longitudinal are in
SI Table 1. Of these three ROIs, whole brain was available for KN-BSI, and hippocampus was available for VBM; all three were available for FreeSurfer; for TBM, data were available only for the statistically defined temporal lobe “Stat-ROI”, the optimal ROI for TBM (
Hua et al., 2010). Statistical comparisons of Cohen’s
d effect sizes were performed using 10⁷ samples drawn from the
a posteriori distributions of means and standard deviations. P-values are provided in for same-ROI effect size comparisons of methodologies with Quarc; for TBM, where there was no tissue ROI in common with Quarc, the p-value is for a comparison of the Stat-ROI effect size with the Quarc whole brain effect size.
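The sampling procedure described above can be sketched as follows, using only the Python standard library. This is a minimal illustration under stated assumptions, not the implementation used here. Cohort means are drawn from scaled and shifted Student's t distributions, and the AD variance from a scaled inverse chi-square distribution, each sampled independently. The function names are hypothetical, the percentile-based interval is an assumption, and the text does not specify how p-values were derived from the samples, so the two-sided comparison rule below is also an assumption. Far fewer than 10⁷ samples are drawn, to keep the example fast.

```python
import math
import random
from statistics import mean, stdev


def sample_t(df, rng):
    """Draw one Student's t variate with df degrees of freedom."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(df / 2.0, 2.0)  # chi-square(df) = Gamma(df/2, scale 2)
    return z / math.sqrt(chi2 / df)


def posterior_d_samples(ad, hc, n_samples, rng):
    """Sample d = (mu_AD - mu_HC) / sigma_AD from the a posteriori distributions."""
    n_ad, n_hc = len(ad), len(hc)
    m_ad, m_hc = mean(ad), mean(hc)
    s_ad, s_hc = stdev(ad), stdev(hc)
    samples = []
    for _ in range(n_samples):
        # Means: sample mean plus standard error times a t variate.
        mu_ad = m_ad + s_ad / math.sqrt(n_ad) * sample_t(n_ad - 1, rng)
        mu_hc = m_hc + s_hc / math.sqrt(n_hc) * sample_t(n_hc - 1, rng)
        # AD variance: (n-1) s^2 / sigma^2 follows chi-square with n-1 df.
        chi2 = rng.gammavariate((n_ad - 1) / 2.0, 2.0)
        sigma_ad = s_ad * math.sqrt((n_ad - 1) / chi2)
        samples.append((mu_ad - mu_hc) / sigma_ad)
    return samples


def interval_95(samples):
    """Central 95% interval read off the empirical cumulative distribution."""
    ordered = sorted(samples)
    n = len(ordered)
    return ordered[int(0.025 * n)], ordered[int(0.975 * n) - 1]


def p_value(samples_a, samples_b):
    """Two-sided Monte Carlo p-value for a difference in effect size (assumed rule)."""
    frac = sum(a > b for a, b in zip(samples_a, samples_b)) / len(samples_a)
    return 2.0 * min(frac, 1.0 - frac)


# Made-up annual percent changes for two hypothetical methods (not ADNI data).
rng = random.Random(0)
ad_a = [-3.1, -2.4, -4.0, -3.5, -2.9, -3.8, -2.2, -3.3]
hc_a = [-0.6, -0.9, -0.4, -1.1, -0.7, -0.5, -0.8, -1.0]
ad_b = [-2.0, -0.5, -3.5, -1.0, -2.8, -0.2, -3.9, -1.6]
hc_b = [-0.7, -1.2, -0.3, -1.5, -0.9, -0.1, -1.1, -0.6]
d_a = posterior_d_samples(ad_a, hc_a, 5000, rng)
d_b = posterior_d_samples(ad_b, hc_b, 5000, rng)
ci_a = interval_95(d_a)
p_ab = p_value(d_a, d_b)
```

Because the two sets of samples are drawn independently, pairing them element by element approximates the distribution of the difference between the two effect sizes.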