3.1. Selected Exclusively Task-Specific ROIs for the Extraction of Feature Vectors
The exclusively task-specific ROIs for the selection of the feature vector are shown in . Since these exclusive ROIs were based on the maximum classification accuracy in each session for each subject, the corresponding optimal p-values were slightly different (). Across the two sessions of all subjects, the ROIs varied to some degree in both spatial distribution and size. For example, the activations of the motor tasks (RH, LH, & RF) for S1 moved to the posterior part of the brain in the 2nd session as compared to the 1st session. For S4, the ROI for the left-hand imagery task appeared in the left inferior frontal and parietal areas, ipsilateral to the task, in the 1st session, but was detected in the contralateral motor area in the 2nd session. During the 2nd session for S5, the activations for the LH and RF tasks became less apparent compared to the 1st session. The IS and VI tasks also involved the areas that were activated by the hand motor imagery tasks.
The exclusively task-specific ROIs obtained from the two sessions (1st and 2nd) for five subjects (S1 ~ S5; the p-value was different depending on the outcome of optimization, and listed in ).
3.2. Results of Classification
The classification results for each subject’s session, measured by the k-fold leave-one-out CV accuracies as well as the hit rates across 30 combinations of training-testing data sets, are summarized as box-whisker plots (Figure 6). Overall, the resulting CV accuracies and hit rates across all of the subjects’ sessions were qualitatively similar. The CV accuracies were 82.0% ± 10.3% for the 4-fold CV and 82.5% ± 10.4% for the 5-fold CV, and the hit rates were 74.3% ± 14.2% for the 4-fold CV and 74.5% ± 14.3% for the 5-fold CV. Depending on the selection of training and testing data sets, the classification accuracies varied even for the same subject in the same session. For the 30 hit rates calculated for each session, a box represents the 25th, 50th, and 75th percentiles and the whiskers denote the 10th and 90th percentiles. The average accuracy is marked with a black square dot, and the minimum and maximum accuracies are shown as ‘×’ marks. The worst average classification accuracy, observed in the 2nd session of S5, corresponded to a hit rate of approximately 50%, which is still higher than the hit rate expected from guessing (i.e. 1/6 ≈ 16.7%). All subjects except one male subject (S5) showed an average classification performance higher than 80% in at least one session, and all of the subjects exceeded 80% in their maximum accuracies. During the 2nd session of S4, 23 out of 30 sets showed 100% hit rates. In general, however, the spread of hit rates between the 25th and 75th percentiles (shown as a box) ranged from about 5% to 20% across the subjects. The hit rates for each imagery task are shown in Figure 7A. Although the average hit rates per task were within 70~80% across the tasks, the LH task showed a relatively larger variation in hit rate, as indicated by the size of its box.
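The box-whisker summary described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the 30 hit rates are synthetic placeholders, the function names `percentile` and `box_whisker_summary` are hypothetical, and linear interpolation between ranks is an assumed percentile convention. The only values taken from the text are the six tasks, the 18 test samples per set, and the 30 training-testing combinations.

```python
import random
import statistics

def percentile(values, q):
    """Percentile (q in [0, 100]) using linear interpolation between ranks."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def box_whisker_summary(hit_rates):
    """Quantities drawn in the plots: box, whiskers, mean, and extremes."""
    return {
        "p10": percentile(hit_rates, 10),   # lower whisker
        "p25": percentile(hit_rates, 25),   # bottom of box
        "p50": percentile(hit_rates, 50),   # median line
        "p75": percentile(hit_rates, 75),   # top of box
        "p90": percentile(hit_rates, 90),   # upper whisker
        "mean": statistics.mean(hit_rates), # square dot
        "min": min(hit_rates),              # lower 'x' mark
        "max": max(hit_rates),              # upper 'x' mark
    }

N_TASKS = 6
CHANCE_LEVEL = 100.0 / N_TASKS  # hit rate (%) expected from guessing

# Synthetic placeholder data: 30 training-testing combinations, each
# scored as the percentage of 18 test samples classified correctly.
rng = random.Random(0)
hit_rates = [100.0 * rng.randint(9, 18) / 18 for _ in range(30)]

summary = box_whisker_summary(hit_rates)
print(f"chance level: {CHANCE_LEVEL:.1f}%")
for name, value in summary.items():
    print(f"{name}: {value:.1f}%")
```

Comparing `summary["mean"]` against `CHANCE_LEVEL` mirrors the comparison made in the text between the worst session average and the guessing probability.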
Figure 6 The box and whisker plots of the leave-one-out CV accuracies (A & C) from training data (n=24) and hit rates (B & D) from testing data (n=18) for the 4- and 5-fold CV schemes (a box: 25th, 50th, and 75th percentiles, whisker: 10th and …
Figure 7 Box-whisker plots of the range of classification results, as measured by the hit rate for (A) each task pooled over all the subjects and sessions, for (B) the variable acquisition times (50-, 40-, 30-, and 20-sec excluding the dummy volumes), and for …
3.3. Classification from Simulated Reduced Acquisition Time and Spatial Resolutions
Classification accuracies for the EPI data simulating reduced acquisition times (40-, 30-, and 20-sec, excluding dummy scans) are shown in Figure 7B. The simulated EPI data with an acquisition time of 40-sec did not degrade the classification performance relative to the original 50-sec data (paired t-test: p > 0.05; t-score = 1.37; d.f. = 299); however, acquisition times of 30-sec or shorter significantly reduced the hit rates across the subjects (paired t-test of 50-sec vs. 30-sec: p < 10⁻⁵; t-score = 4.26; d.f. = 299; paired t-test of 50-sec vs. 20-sec: p < 10⁻³²; t-score = 13.53; d.f. = 299). For two participants (S1 & S4), however, an average classification performance of 80% was maintained for simulated data sets as short as 30-sec. The classification accuracies for the three different voxel sizes are shown in . For all subjects, the activation maps obtained at the spatial resolution R2 (voxel size = 7.5 × 7.5 × 5.5 mm³) showed classification results comparable to those at the original spatial resolution R1 (voxel size = 3.75 × 3.75 × 5.5 mm³). However, the classification accuracies degraded drastically at the spatial resolution R3 (voxel size = 15 × 15 × 11 mm³) across all the subjects.
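For readers who want to see how a paired comparison with d.f. = 299 arises, the sketch below computes a paired t-statistic from two matched lists of hit rates. It is an illustration only: the data are synthetic, the helper `paired_t` is a hypothetical name, and the pairing of 300 hit rates (which would be consistent with 5 subjects × 2 sessions × 30 training-testing combinations) is an assumption inferred from the reported degrees of freedom rather than stated in the text.

```python
import math
import random
import statistics

def paired_t(a, b):
    """Return (t, degrees of freedom) for a paired t-test of a versus b."""
    if len(a) != len(b):
        raise ValueError("paired samples must have equal length")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of differences
    if sd == 0:
        raise ValueError("all paired differences are identical")
    t = statistics.mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1

# Synthetic placeholder data: 300 paired hit rates (%), e.g. the same
# training-testing combination evaluated at 50-sec and at 30-sec.
rng = random.Random(1)
hit_50s = [rng.uniform(50.0, 100.0) for _ in range(300)]
hit_30s = [h - rng.gauss(3.0, 10.0) for h in hit_50s]

t_score, dof = paired_t(hit_50s, hit_30s)
print(f"t = {t_score:.2f}, d.f. = {dof}")
```

The p-value would then be read from the t-distribution with `dof` degrees of freedom; that step is left out here to keep the sketch dependency-free.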
3.4. Cortical Areas of Activations as Defined by the SVM Training across the Subjects
The brain areas of activation showed large inter-subject variations. shows the FOA within the ROI which was defined by the optimized p-values while showing a maximum hit rate (). A different gray scale was applied with a FOA inside a circle (10 was the maximum number of occurrences of activation from 2 sessions across 5 subjects) for 16 representative cortical areas. The bar graphs of average FOA across the tasks for each region along with the standard deviation (whisker) are shown in . Based on this FOA analysis, several anatomical areas, including the SMA (d), the left auditory area (h), the inferior frontal cortices (i), and the cingulate gyri (l), were more commonly activated than other parts of the brain throughout the six imagery tasks.
On the other hand, dominant trends of region-specific activation were noted depending on the characteristics of the imagery tasks. For example, the left primary motor area (b) was more frequently activated for the RH, RF and IS tasks than it was for the MC and VI tasks. Superior frontal gyri (e) and dorsolateral prefrontal gyri (j), along with cingulate gyri (l), were selectively activated during the MC task (Cowell et al., 2000), while left middle and inferior temporal gyri (m) were dominantly activated during the IS task (Shergill et al., 2004). The superior parietal lobe (c) and occipital lobe (g) were more frequently activated for the MC (FOA≥6 in both hemispheres) and VI (FOA≥5 in both hemispheres) tasks, which indicates the involvement of visual memory (Tang et al., 2006; Kosslyn and Thompson, 2003). Among the motor imagery tasks (RH, LH, & RF), leftward laterality of RH and RF was observed at the primary motor area (b); however, this trend was obscured during the motor imagery tasks of the left hand (Stinear et al., 2006).
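The FOA tally used in this section can be sketched as a simple count of how many of the 10 sessions (2 sessions × 5 subjects) showed activation in a given region for a given task. The example below is a hedged illustration: the activation sets are invented placeholders, and the names `frequency_of_activation` and `mean_sd_across_tasks` are hypothetical rather than taken from the study.

```python
from collections import Counter
import statistics

def frequency_of_activation(session_regions):
    """Count, per region, the number of sessions in which it was active.

    session_regions: one collection of region labels per session.
    """
    foa = Counter()
    for regions in session_regions:
        foa.update(set(regions))  # a region counts at most once per session
    return foa

def mean_sd_across_tasks(foa_by_task, region):
    """Average FOA and standard deviation of one region across tasks."""
    counts = [foa[region] for foa in foa_by_task.values()]
    return statistics.mean(counts), statistics.stdev(counts)

# Invented placeholder activations for two tasks over 10 sessions each.
hypothetical_sessions = {
    "RH": [{"SMA", "left primary motor"}] * 7 + [{"SMA"}] * 2 + [set()],
    "MC": [{"SMA", "superior parietal"}] * 6 + [{"superior parietal"}] * 4,
}

foa_by_task = {
    task: frequency_of_activation(sessions)
    for task, sessions in hypothetical_sessions.items()
}
sma_mean, sma_sd = mean_sd_across_tasks(foa_by_task, "SMA")
print(foa_by_task["RH"]["SMA"], foa_by_task["MC"]["SMA"], sma_mean)
```

With 10 sessions in total, the maximum possible FOA for any region is 10, matching the upper bound stated in the text.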
3.5. Classification from the Cross-Session Test
The hit rates from the cross-session test (CST) along with the within-session test (WST) are summarized in . Firstly, the average hit rate from the WST using the normalized volumes (voxel size = 4 × 4 × 4 mm³; 77.6%) was comparable to that obtained from the original (unsmoothed) volumes (3.75 × 3.75 × 5.5 mm³; 74.5%) shown in . Overall, the classification performance of the CST was not significantly decreased compared to that of the WST; the average hit rates (WST: CST; %) for each subject were (92.0: 81.0) for S1, (71.0: 69.1) for S2, (76.6: 71.4) for S3, (91.0: 95.3) for S4, and (57.5: 38.1) for S5. For S5, although the average hit rate from the 1st session was 71.9%, the 2nd session resulted in an average hit rate of 43%, which is consistent with the deterioration of the classification performance of the CST for this subject. The average hit rates along with the standard deviations of the WST and CST across all subjects’ normalized data were 77.6% ± 16.2% and 71.0% ± 20.2%, respectively.
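As a quick arithmetic check, the overall WST and CST averages can be recovered from the per-subject averages quoted above. The sketch below uses only numbers given in the text; the variable names are illustrative. The reported standard deviations are not recomputed, because the individual hit rates behind them are not listed here.

```python
import statistics

# Per-subject average hit rates (%) as quoted in the text.
wst = {"S1": 92.0, "S2": 71.0, "S3": 76.6, "S4": 91.0, "S5": 57.5}
cst = {"S1": 81.0, "S2": 69.1, "S3": 71.4, "S4": 95.3, "S5": 38.1}

wst_mean = statistics.mean(wst.values())
cst_mean = statistics.mean(cst.values())

# Positive values mean the cross-session test performed worse.
drop = {subject: wst[subject] - cst[subject] for subject in wst}
largest_drop = max(drop, key=drop.get)

print(f"WST mean: {wst_mean:.1f}%  CST mean: {cst_mean:.1f}%")
print(f"largest WST-to-CST drop: {largest_drop} ({drop[largest_drop]:.1f} points)")
```

The per-subject differences also make it easy to see that S4 is the only subject whose CST average exceeds the WST average.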
The hit rates from the within-session test (WST) and the cross-session test (CST) for each subject. The hit rates from the WST are shown as the baseline performance of the normalized fMRI data.
3.6. Classification from the Between-Subject Test (BST)
Figure 9 shows the average (bar) and standard deviation (whisker) of the resulting hit rates across all six tasks (A) and for each individual task (B). The four choices for the number of training subjects (i.e. one, two, three, & four subjects) are denoted as ‘a’ through ‘d’. The numbers of combinatorial selections of training and testing subjects corresponding to the choices ‘a’ through ‘d’ are 5 (=5C1), 10 (=5C2), 10 (=5C3), and 5 (=5C4), respectively. Higher hit rates were expected for a larger number of training subjects’ data, since more variations of the feature vectors for each task could be modeled into the classifier. Again, the overall hit rates (i.e. 68.3% ± 10.1%) were still higher than the hit rate expected by chance. When four subjects’ data were used as training data and the one remaining subject’s data as testing data (i.e. ‘d’ in Figure 9A), the lowest hit rate of 51.2% was obtained from subject S5, who also showed the lowest average hit rate in both the within- and cross-session tests. The variation of classification performance depending on the type of imagery task is shown in Figure 9B. The three motor imagery tasks (i.e. RH, LH, & RF) demonstrated consistently higher hit rates than the remaining non-motor imagery tasks (i.e. MC, IS, & VI). Overall, the visual imagery task (VI) showed the lowest hit rate. Information regarding the computer hardware and the computational time of each analytical step is summarized in .
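The counts of training-testing subject selections quoted above (5, 10, 10, and 5) follow directly from choosing one to four training subjects out of five. A minimal sketch that enumerates these splits is given below; the generator name `bst_splits` is hypothetical, and the code only illustrates the combinatorics, not the classifier training itself.

```python
from itertools import combinations
from math import comb

SUBJECTS = ["S1", "S2", "S3", "S4", "S5"]

def bst_splits(subjects, n_train):
    """Yield (training subjects, testing subjects) for a between-subject test."""
    for train in combinations(subjects, n_train):
        test = tuple(s for s in subjects if s not in train)
        yield train, test

# Choices 'a' through 'd' correspond to one through four training subjects.
counts = {
    label: len(list(bst_splits(SUBJECTS, n_train)))
    for label, n_train in zip("abcd", (1, 2, 3, 4))
}
print(counts)

# Choice 'd': four training subjects, one held-out testing subject.
for train, test in bst_splits(SUBJECTS, 4):
    print("train:", train, "test:", test)
```

Under choice ‘d’ each subject is held out exactly once, which is the setting in which the text reports the lowest hit rate for S5.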
Figure 9 From the between-subject test, the average (bar) and standard deviation (whisker) values of the hit rates obtained (A) across all the tasks, and (B) for each individual task (a: one subject’s data for training & the remaining four subjects’ …
Average computational time (h: hour; m: minute; s: second) for each analytical step (Intel Pentium D CPU 3GHz; 3.5GB RAM; Linux Slackware v12.1; MATLAB v7.1).