3.1) MVPA results
Before relating MVPA results to symptom severity, we tested for significant classification performance in the regions of interest. Three approaches were taken to define these regions, as described in the Methods, in part because of the difficulty of using a traditional face localizer in a group characterized by face hypoactivation. In the first approach, three overlapping spheres were placed at coordinates from previous FFA studies, giving a 29-voxel cluster in the right fusiform gyrus. Both the typically-developing (M = 0.74, s.d. = 0.09) and ASD (M = 0.78, s.d. = 0.07) groups showed high face vs. house classification performance, where chance was 0.50. Permutation tests revealed that classification performance was significantly above chance for all typically-developing and ASD participants (p < 0.05 for all but one control who had a trend at p = 0.06). There were no significant group differences in classification performance (t22 = 1.24, p = 0.23).
The second approach employed a 25-voxel cluster in the right fusiform gyrus and a 29-voxel cluster in the left fusiform gyrus, reflecting significant face activation in the control group. These were not suitable regions of analysis for the control participants, but ASD participants demonstrated high classification performance in the right (M = 0.69, s.d. = 0.11) and left (M = 0.64, s.d. = 0.06) regions. Permutation testing revealed significant above-chance performance for all but one ASD individual in the right cluster (ten participants at p < 0.02, one at p = 0.05, one at p = 0.38) and all but two in the left cluster (ten participants at p < 0.02, one at p = 0.09, one at p = 0.39). The participant with the least significant result in the right cluster (p = 0.38) and a trend in the left cluster (p = 0.09) corresponded to the left-handed ASD individual in the sample.
The final fusiform region of interest was the fusiform hypoactivation detected in the ASD group through a univariate group comparison: a 23-voxel cluster in the right fusiform gyrus and a 6-voxel cluster in the left fusiform gyrus. The control group showed high classification performance in the right (M = 0.71, s.d. = 0.08) and left (M = 0.71, s.d. = 0.08) clusters. The ASD group had lower performance in the right hypoactive cluster (M = 0.66, s.d. = 0.09) than controls, although not significantly so (t22 = 1.53, p = 0.14). The ASD group’s classification performance in the smaller left fusiform cluster (M = 0.54, s.d. = 0.04) was significantly lower than the control group’s (t22 = 6.51, p < 0.001). Permutation testing showed that classification accuracies were significantly above chance in the right and left regions for all controls (p < 0.04). In the ASD group, classification performance was significant or approaching significance for the right cluster in all but one participant (nine at p < 0.05; two at p < 0.08, including the left-handed ASD participant; one at p = 0.18). For the left hypoactive cluster, however, ten of the ASD participants’ classification accuracies were not significantly above chance (two at p < 0.03, one at p = 0.06, nine at p > 0.13). As performance in the left hypoactive region was not significant for the majority of the ASD group, possibly because of the small size of the cluster (6 voxels), we did not analyze this area further.
To assess the importance of the voxel patterns, we replaced the voxel responses with the regions’ mean activation levels at each time point, and repeated the above classifications. This replacement produced a substantial reduction in ASD classification performance for the coordinate-defined spheres (M = 0.59, s.d. = 0.07; two-tailed paired comparison: t11 = 9.43, p < 0.001), the area of control-group right face activation (M = 0.52, s.d. = 0.07; t11 = 6.99, p < 0.001), control-group left face activation (M = 0.54, s.d. = 0.06; t11 = 6.61, p < 0.001), and the right hypoactive cluster (M = 0.50, s.d. = 0.07; t11 = 5.56, p < 0.001). Control participants also experienced significant reductions in classification performance for the coordinate-defined spheres (M = 0.56, s.d. = 0.07; t11 = 5.62, p < 0.001) and right hypoactive cluster (M = 0.64, s.d. = 0.05; t11 = 3.59, p = 0.004). It is noteworthy that in the region of right hypoactivation, using mean activation values instead of the voxel patterns was particularly detrimental for ASD classification performance: the mean reduction in performance was 0.07 (s.d. = 0.07) for controls and 0.16 (s.d. = 0.10) for ASD participants, giving a significant group-difference in the size of the decrease (t10 = 2.83, p = 0.02). Using mean activation values here is conceptually similar to a typical univariate analysis, but has an advantage of producing results on the same scale as MVPA. The MVPA and mean-replaced classification results are shown in .
Classification performance within the fusiform regions of interest for control and ASD participants
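The mean-replacement control described above can be illustrated with a minimal sketch. It assumes the data for one region are held as a list of time points, each a list of voxel responses; the function name and the toy values are hypothetical and are not taken from the study's analysis code.

```python
from statistics import mean


def replace_with_region_mean(patterns):
    """Set every voxel at a time point to that time point's regional mean.

    patterns: list of time points, each a list of voxel responses.
    The output has the same shape, so it can be passed to the same
    classifier, but all across-voxel pattern information is removed.
    """
    replaced = []
    for voxels in patterns:
        region_mean = mean(voxels)
        replaced.append([region_mean] * len(voxels))
    return replaced


# Toy example: 2 time points x 3 voxels (made-up values).
toy = [[1.0, 2.0, 3.0], [4.0, 4.0, 7.0]]
flattened = replace_with_region_mean(toy)
```

Because the replaced data keep their original dimensions, classification accuracies from the two versions of the data are directly comparable, which is the advantage noted in the text.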
We conducted face vs. house classifications in PPA clusters for each participant, for our subsequent analysis of the relationship between MVPA results and symptom severity. In a 27-voxel cluster centered in the right PPA of each individual, control (M = 0.84, s.d. = 0.07) and ASD (M = 0.87, s.d. = 0.05) participants showed high classification performance. Similarly, a 27-voxel cluster in the left PPA gave high classification performance in the control (M = 0.82, s.d. = 0.08) and ASD (M = 0.85, s.d. = 0.06) individuals. Permutation testing revealed greater-than-chance accuracy for both regions in every participant (p < 0.04). We also conducted the face vs. house classification within two visually-responsive occipital areas, a 23-voxel cluster near the calcarine sulcus of each participant, and a larger approximation of BA17, for subsequently relating performance to symptom severity. In the 23-voxel cluster, classification performance was high in control (M = 0.67, s.d. = 0.04) and ASD (M = 0.69, s.d. = 0.06) participants, with no significant difference between the groups (t22 = 0.90, p = 0.38). Permutation testing showed greater-than-chance accuracy in all participants (p < 0.05). Similarly for the BA17 region, classification performance was high in control (M = 0.77, s.d. = 0.04) and ASD (M = 0.76, s.d. = 0.04) participants, with no significant difference between the groups (t22 = 0.48, p = 0.64). Permutation testing revealed greater-than-chance accuracy in all participants (p < 0.002). We had predicted above-chance performance in these regions in advance, because of the visual differences in the stimuli, as described in the introduction. 
Verifying above-chance classification performance is important for the later link to symptom severity: as the two stimuli classes can be distinguished in these visually-responsive areas, any lack of a relationship with symptom severity in the next stage of the investigation cannot be because there is no relevant information in these regions.
3.2) Relationship to symptoms
We investigated the sensitivity of MVPA to individual variation in patient symptoms by examining the relationship of classification performance to standardized measures of clinical severity. We also assessed the relationship between symptoms and a univariate measure. Face vs. house classification accuracy was significantly negatively correlated with patients’ ADOS total scores for all the right fusiform regions (). These scores are a measure of severity from a structured extended interaction with the patient by an experienced clinical professional. Higher ADOS scores indicate greater severity of symptoms, such that lower classification accuracies were found in more severely affected ASD individuals. Significant negative correlations were also found between the ADOS social component sub-scores and classification accuracies in the right hypoactive and control-group right face activation clusters (). The relationship for the coordinate-defined spheres approached significance. The social component score of the ADI, an additional assessment of social symptom severity from rated interviews with one of the patients’ parents, was significantly related to classification performance in all three right fusiform regions. Performance in the left area of control-group face activation was not significantly related to the clinical measures, although it approached significance for the ADOS total score. The regions’ mean activation values (face z-scores – house z-scores) were not significantly correlated with any measure of clinical severity. lists the statistical values for these results.
Correlations between multivariate and univariate results and clinical measures of symptom severity
Scatter plots of face vs. house classification performance against ADOS social scores of the ASD participants
Analyses of the PPA clusters showed that classification performance was not significantly correlated with ADOS total or social scores (). Classification accuracies were, however, significantly correlated with the ADI social scores in the right and left PPA clusters (both at p = 0.02). Despite this latter significant result, the weak relationship between PPA classification accuracy and symptom severity as measured by the ADOS suggests a degree of specificity for the fusiform areas. Classification performance within the 23-voxel cluster near the calcarine sulcus, and within the approximate BA17 region, was not significantly correlated with ADOS total or social scores (all p > 0.7), suggesting the significant relationships in the fusiform regions do not result from differences in basic visual processing. This is also evidence against scanner motion acting as a mediating factor in the significant relationships with symptoms: any systematically-varying scanner motion would affect other brain areas encoding stimuli differences. Although BA17 classification performance was unexpectedly significantly negatively correlated with ADI social scores (p = 0.05), the very weak correlations with ADOS scores (p = 0.78 and p = 0.83), and a weak ADI relationship with performance in the 23-voxel occipital cluster (p = 0.68), give confidence that the strong fusiform and ADOS correlations are not because of early visual processing or motion differences. Motion effects are further ruled out by the very weak correlations between scanner movement and classification performance in all the fusiform regions (coordinate-defined spheres: r = -0.06, p = 0.84; control-group right face activation: r = -0.14, p = 0.66; control-group left face activation: r = 0.00, p > 0.99; right hypoactive cluster: r = -0.07, p = 0.84).
Additionally, neither age (coordinate-defined spheres: r = 0.04, p = 0.90; control-group right face activation: r = 0.13, p = 0.69; control-group left face activation: r = -0.11, p = 0.74; hypoactive cluster: r = 0.16, p = 0.63) nor IQ (coordinate-defined spheres: r = 0.37, p = 0.24; control-group right face activation: r = 0.47, p = 0.12; control-group left face activation: r = 0.18, p = 0.58; hypoactive cluster: r = 0.42, p = 0.17) was significantly correlated with performance in the fusiform regions, suggesting these variables were not driving the significant effects. Finally, we examined the signal-to-noise ratio (SNR) to ensure that the relationships between classification performance and symptom severity were not driven by a systematically lower SNR in participants with greater symptom severity. For each ASD individual, we calculated a value for the SNR by dividing the mean baseline (an estimate of the signal) by the standard deviation of the residual time series (an estimate of the noise). There were no significant relationships between symptom severity, measured through the ADOS social scores, and mean SNRs in the VT lobe (r = 0.03, p = 0.93), right control-group face activation (r = -0.36, p = 0.25), left control-group face activation (r = -0.46, p = 0.13) or right hypoactive cluster (r = -0.31, p = 0.33). The SNR in the coordinate-defined spheres was close to being significantly related to symptom severity (r = -0.57, p = 0.055); however, the weak relationships for the other regions suggest that systematic differences in SNR cannot account for the MVPA – symptom severity relationships.
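The SNR calculation described above amounts to a single ratio, sketched below. The function name and toy values are hypothetical, and the use of the sample (n - 1) standard deviation is an assumption, as the text does not state which estimator was used.

```python
from statistics import stdev


def signal_to_noise(mean_baseline, residuals):
    """SNR = mean baseline (signal estimate) / s.d. of residuals (noise estimate).

    residuals: residual time series after model fitting.
    Assumption: sample standard deviation (n - 1 denominator).
    """
    return mean_baseline / stdev(residuals)


# Toy example with made-up values.
toy_snr = signal_to_noise(100.0, [-2.0, 0.0, 2.0])
```

One such value per participant and region can then be correlated with the ADOS social scores in the same way as the classification accuracies.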
Scores on the Benton face recognition task, where higher scores indicate greater face recognition ability, were not significantly related to classification performance in the coordinate-defined spheres (ASD: r = 0.29, p = 0.35; controls: r = -0.32, p = 0.31) or the left cluster of control-group face activation (ASD: r = 0.46, p = 0.13), but approached significance in the right cluster of control-group face activation (ASD: r = 0.54, p = 0.07) and right hypoactive cluster (ASD: r = 0.56, p = 0.06; controls: r = -0.28, p = 0.37) for ASD participants.
Behavioral performance for the in-scan ‘same vs. different’ task with neutral faces was not significantly correlated with classification performance in the fusiform regions for ASD participants (coordinate-defined spheres: r = 0.10, p = 0.76; control-group right face activation: r = -0.23, p = 0.47; control-group left face activation: r = -0.33, p = 0.30; right hypoactive cluster: r = -0.15, p = 0.63) or controls (coordinate-defined spheres: r = -0.27, p = 0.40; right hypoactive cluster: r = 0.05, p = 0.88). It is possible that these weak correlations are due to behavioral performance approaching ceiling, although controls (face M = 0.88, s.d. = 0.07; house M = 0.96, s.d. = 0.03) showed greater task accuracy than ASD participants (face M = 0.80, s.d. = 0.09; house M = 0.92, s.d. = 0.05) for faces (t22 = 2.34, p = 0.03) and houses (t22 = 2.66, p = 0.01). It is also possible that the influence of the passive viewing task activation on the classification results dilutes a link between behavioral and classifier performance.
We also explored whether lower classification accuracies in more severely-affected participants result from greater variability in their multi-voxel face patterns, or because their face and house patterns are less discriminable (more positively correlated). To examine this, we performed within- and between- category correlation analyses in the fusiform regions with MVPA – symptom relationships. The right hypoactive region’s face patterns were significantly or close-to-significantly less correlated with average face patterns in individuals with increased symptom severity (ADOS social: r = -0.63, p = 0.03; ADI social: r = -0.64, p = 0.03; ADOS total: r = -0.54, p = 0.07). There were no significant relationships between house correlations and ADOS total or social scores, with just a trend for ADI social scores (r = -0.55, p = 0.07). The between-category correlations were more positive (reflecting less discriminable patterns) in individuals with higher ADI social scores (r = 0.71, p = 0.01; r = 0.69, p = 0.01), but this did not reach significance for the ADOS total or social scores. In the right cluster of control-group face activation, only the ADI scores gave significant relationships: increased symptom severity was associated with less correlated within-category face patterns (r = -0.63, p = 0.03) and higher between-category correlations (r = 0.78, p = 0.003; r = 0.73, p = 0.01), with a weak trend for house patterns (r = -0.52, p = 0.09). The coordinate-defined spheres had no significant symptom relationships for within- or between-category correlations, excepting between house correlations and ADOS total scores (r = -0.59, p = 0.04), although this was not significant for the ADOS or ADI social scores.
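A minimal sketch of the within- and between-category correlation measures follows. It rests on assumptions that are not specified in the text: each pattern is correlated with the average of the remaining patterns from its own category (within), or with the average pattern of the other category (between), and the resulting Pearson coefficients are averaged. All names and toy values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_pattern(patterns):
    """Voxel-wise mean across a list of multi-voxel patterns."""
    return [sum(col) / len(patterns) for col in zip(*patterns)]


def within_category(patterns):
    """Mean correlation of each pattern with the average of the others."""
    rs = []
    for i, p in enumerate(patterns):
        others = patterns[:i] + patterns[i + 1:]
        rs.append(pearson(p, average_pattern(others)))
    return sum(rs) / len(rs)


def between_category(patterns_a, patterns_b):
    """Mean correlation of each pattern in A with the average pattern of B."""
    avg_b = average_pattern(patterns_b)
    return sum(pearson(p, avg_b) for p in patterns_a) / len(patterns_a)


# Toy example: 2 patterns per category, 3 voxels each (made-up values).
faces = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.5]]
houses = [[3.0, 2.0, 1.0], [4.0, 3.0, 1.0]]
face_consistency = within_category(faces)
face_house_similarity = between_category(faces, houses)
```

Under this sketch, lower within-category values indicate more variable patterns, and more positive between-category values indicate less discriminable face and house patterns.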
3.3) Searchlight analysis
We examined regional variation of multivariate and univariate sensitivities to symptom severity in the VT lobes using a spherical searchlight analysis. We ran the searchlight technique with multivariate (face vs. house classification accuracy) and univariate (mean activation to faces minus mean activation to houses) measures for each ASD participant. The recorded values for each sphere were correlated with the ADOS social scores. The searchlight procedure was conducted using a radius of 2, 3 and then 4 voxels to examine how the results vary with searchlight size. The strength of the relationships between the searchlight measures and symptom severity was greater overall for the multivariate measure than for mean activation, indicated by two-tailed paired t-tests on the absolute correlation coefficients of searchlights with a 2-voxel (t970 = 3.96, p < 0.001), 3-voxel (t970 = 3.06, p = 0.002) and 4-voxel radius (t970 = 2.47, p = 0.01).
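The paired comparison of absolute correlation coefficients can be sketched as follows. In a paired t-test the degrees of freedom equal the number of pairs minus one, so the 970 degrees of freedom reported above correspond to 971 paired searchlights. The function name and toy values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_on_abs_r(r_multivariate, r_univariate):
    """Paired t statistic on |r| across searchlights.

    Each list holds one correlation coefficient (measure vs. symptom
    score) per searchlight, in the same searchlight order.
    Returns (t, degrees_of_freedom).
    """
    diffs = [abs(a) - abs(b) for a, b in zip(r_multivariate, r_univariate)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Toy example with 4 searchlights (made-up coefficients).
t_stat, dof = paired_t_on_abs_r([0.6, -0.5, 0.4, 0.7], [0.2, 0.3, -0.1, 0.1])
```

Taking absolute values first means the test compares the strength of the relationships, regardless of their sign.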
We also examined which searchlights were significantly related to symptom severity by permuting the participants’ ADOS social scores 10,000 times and computing the searchlights’ correlations for each permutation. Comparing the correlation obtained with the correct ADOS scores against these permuted correlations gave a map of p-values for each searchlight size, for the univariate and multivariate measures. No clusters of more than one searchlight were significantly correlated with ADOS social scores when mean activation was employed as the searchlight dependent variable, for any of the three radii (at a liberal threshold of p < 0.01). In contrast, when classification performance was employed as the searchlight measure, using a 2-voxel radius detected three searchlight clusters (of at least 2 contiguous central voxels) that were significantly related to symptom severity: a 10-voxel cluster centered in the right fusiform gyrus (center of mass: x = 39, y = -36, z = -18), a 4-voxel cluster centered in the left parahippocampal gyrus (center of mass: x = -25, y = -39, z = -13) and a 3-voxel cluster centered in the right inferior temporal gyrus (center of mass: x = 58, y = -31, z = -19). These results are shown in . Using a 3-voxel searchlight radius revealed a 4-voxel cluster in the right fusiform gyrus (center of mass: x = 41, y = -35, z = -20) that partially overlapped with the 10-voxel cluster reported in the 2-voxel radius analysis. No significant searchlights were detected with a 4-voxel radius. We also applied a cluster-based correction for multiple comparisons, although this did not produce significant results for any of the searchlight radii. Despite this null finding, the detection of these regions at a liberal threshold, using MVPA results but not the univariate measure, suggests that symptoms are more strongly related to the MVPA results than to the univariate measure employed here.
Searchlights with significant correlations between face vs. house classification performance and ADOS social scores
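The permutation procedure used to obtain a p-value for a single searchlight can be sketched as below. The two-tailed comparison on absolute correlations and the "+1" correction in the p-value are assumptions made for this illustration rather than details given in the text; the function names and toy data are hypothetical.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def permutation_p(measure, scores, n_perm=10000, seed=0):
    """Permutation p-value for the correlation of one searchlight's
    measure (one value per participant) with symptom scores.

    Scores are shuffled across participants n_perm times; the p-value
    is the proportion of permuted |r| values at least as large as the
    observed |r| (two-tailed and +1-corrected, both assumptions).
    """
    rng = random.Random(seed)
    observed = pearson(measure, scores)
    shuffled = list(scores)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(measure, shuffled)) >= abs(observed):
            count += 1
    return observed, (count + 1) / (n_perm + 1)


# Toy example: 12 participants, made-up accuracies and symptom scores.
accuracy = [0.80, 0.78, 0.75, 0.74, 0.70, 0.69, 0.66, 0.65, 0.62, 0.60, 0.58, 0.55]
symptom = [2, 3, 3, 5, 6, 6, 8, 9, 10, 11, 12, 13]
r_obs, p_perm = permutation_p(accuracy, symptom)
```

Repeating this for every searchlight centre yields the map of p-values to which the threshold and cluster criteria are then applied.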
3.4.1) Review of aims
The primary aim of this study was to examine if MVPA measures can be sensitive to patient symptom severity. We found that classification performance, a multivariate measure of separability for the face and house fMRI patterns, was strongly related to standard measures of clinical severity in ASD participants. Specifically, in both anatomically and functionally defined clusters of right fusiform voxels, classification performance was significantly negatively correlated with symptom severity, while mean activation levels were not. This greater sensitivity of pattern analyses extended to voxels that were defined using differences in univariate measures. Assessments of the PPA region showed that this sensitivity is not a general property of VT cortex. Analyzing two occipital areas additionally confirmed that the finding did not generalize to activity patterns involved in early visual processing. A searchlight analysis across the ventral temporal lobes detected regions where classification performance was significantly related to symptom severity, which were not detected using the searchlights’ mean activation levels, although only when a liberal threshold was employed.
We have provided an example of obtaining these benefits from a functional dataset not designed with MVPA in mind. By combining multiple face conditions into one face class, we were able to utilize a large number of trials for classifier training and testing. Although the design of the experiment placed limitations on the conclusions that can be drawn from the results (discussed below), multivariate classification performance was still more sensitively related to symptom severity than the univariate measures we employed.
3.4.2) MVPA and patient groups
The findings in this study provide, to the best of our knowledge, the first evidence that MVPA functional results can reliably predict clinical symptoms. The sensitivity to clinical severity obtained using MVPA supports the idea that subtle variations in activity patterns, reflected in MVPA results, can in some cases more sensitively reflect individual variation in an area’s functional characteristics, than certain measures of mean activation. It is noteworthy that this stronger link to symptom severity was also found in a set of voxels that was defined based on a univariate statistic (i.e., a significant difference in activation values between groups). This latter finding gives confidence that the greater sensitivity does not reflect the activity patterns of voxels that are separate from those demonstrating univariate differences.
The greater sensitivity to individual differences reported here will be of interest to researchers who are involved in characterizing variation across participants in a wide range of fields. Among clinical investigators, this interest may even extend to those looking to select individuals for future interventions. As Scherf et al. (2010) noted when discussing the failure to find a significant relationship between (univariate) fusiform gyrus activation and ADOS scores, “such predictability could have substantial implications for identifying individuals who might benefit from a behavioral intervention designed to improve face processing” (p.13). Future research will be required to establish if the sensitivity reported here extends to other patient groups, and to other regions with reduced univariate activation.
The findings reported in this study are also relevant to clinical researchers looking to make the most of existing functional datasets. The detection of regions with patterns of activity that reflect variations in patient symptoms, without a corresponding significant univariate relationship, suggests the encouraging possibility that additional regions of interest may be identifiable in previously-collected datasets. In this dataset, we found that symptom severity was related to face vs. house classification performance in a coordinate-defined area (based on face activity coordinates in prior literature), which neighbors hypoactivation in this particular group of ASD participants. A significant relationship here suggests that areas of the fusiform gyrus, without a significant difference in univariate activation, may nevertheless show activity patterns that vary systematically with symptom severity. This may be expected from an activity pattern perspective, where a significant group difference in univariate activation can be conceptualized as two very distinct activity patterns. Variations in face perception-related activity may still be present in nearby areas of cortex, even if undetectable with univariate techniques.
Linking multivariate searchlight results to individual differences gives further potential for revealing new regions of interest. In the context of this dataset, our searchlight finding of a symptom relationship in the inferior temporal gyrus (ITG) fits with several previous studies that have suggested the region may play a role in face processing in this patient group (e.g., Koshino et al., 2008; Schultz et al., 2000). Some univariate studies have not detected ITG involvement (e.g., Pierce et al., 2001), giving the possibility that systematic differences in ITG activity (differences that may not always be detectable with univariate analyses) could have been present, undetected, in the functional data of such studies. Although the identified searchlight locations have backing from prior literature, the failure to detect these regions at a more stringent threshold means these results should be interpreted with some caution. Despite this caveat, the multivariate searchlight approach has the potential to highlight new regions that are functionally related to patient symptoms.
Our examination of within- and between- category correlations may be of interest to investigators exploring the basis for MVPA–symptom relationships. In this dataset, the within-participant consistency of multi-voxel face patterns was lower in individuals with greater symptom severity in the hypoactive fusiform region. This may be an underlying factor in the MVPA–symptom relationships, although our finding that greater symptom severity, when measured by the ADI, is accompanied by less discriminable face and house patterns is also suggestive. Future studies may wish to examine further the relative contributions of face pattern consistency and face / non-face discriminability, including whether the variations in multi-voxel face patterns reflect larger differences across face types (which varied by run here), or within face types.
Although in this particular context we found multivariate measures to be a strong predictor of symptoms, there may be other contexts and questions where univariate measures are more sensitive. In a recent MVPA study, Quamme et al. (2010) reported that univariate measures were more sensitive than MVPA to task behavior, at the group level, in several of the regions they examined. As Quamme and colleagues described, the complexity of MVPA is accompanied by a vulnerability to over-fitting noise, which is less likely to occur with across-voxel averages (Quamme et al., 2010). It is therefore very possible that univariate measures could provide a more sensitive measure of individual differences than MVPA in some circumstances. For this reason, it should not be concluded from this paper that MVPA will always be a more sensitive measure for tracking functionally-relevant individual differences, but that in certain circumstances it can be. We further note that a variety of univariate measures are available for fMRI analyses. Although we found that MVPA results were a stronger predictor of symptom severity than the univariate measure we employed (mean face activation – mean house activation), this may not apply to all univariate measures. This is additionally relevant as the MVPA and univariate measures employed here differ in the number of free parameters. Overall, we view the univariate and multivariate approaches as complementary, with each adding its own value.
Despite the findings we report here, it must be acknowledged that the design of the original fMRI study places a limit on interpretations. Specifically, the inclusion of two stimulus categories, faces and houses, limits the condition-specific conclusions that can be drawn, as discussed in the introduction. Future studies of neural differences in face processing may consider including additional stimulus classes, such as scrambled images and object categories (as in Haxby et al., 2001; Spiridon and Kanwisher, 2002). Assessing classifications of faces vs. non-faces, alongside the classification of different non-face categories, would further investigations of face-specific activity patterns in ASD individuals. Employing alternative processing and classification methods may also contribute additional insights (see O’Toole et al. (2007) for a discussion of different classification approaches). Another approach for future research would be to perform MVPA within anatomically-defined ROIs. For example, ASD symptom severity should be correlated with face-related classification performance in the fusiform, but not parahippocampal, gyrus. As PPA activity can sometimes extend into the fusiform gyrus (Epstein and Kanwisher, 1998), it may be desirable to restrict such anatomically-defined fusiform ROIs to certain sub-sections of the gyrus for face vs. house classifications, or expand the non-face classes as discussed above. Future studies may also wish to use MVPA to study activity pattern variations for different face identities in individuals with an ASD. Investigating the nature of the multi-voxel patterns generated by different faces in this group could advance our understanding of their face processing differences.
We have shown that MVPA can act as a sensitive fMRI predictor of patient symptoms. We believe this study highlights an important use of MVPA techniques for the study of autism and other clinical conditions. The application of pattern analysis techniques to patient differences is still in its infancy, but this investigation shows that the approach has the potential to measure clinically relevant patterns. Furthermore, MVPA combined with mapping techniques can identify brain regions that may not be revealed with certain univariate approaches.