The scanning protocol is a novel Brain-Computer Interface (BCI) implementation that can be controlled with sensorimotor rhythms (SMRs) of the electroencephalogram (EEG). The user views a screen that shows four choices in a linear array with one marked as target. The four choices are successively highlighted for 2.5 s each. When a target is highlighted, the user can select it by modulating the SMR. An advantage of this method is the capacity to choose among multiple choices with just one learned SMR modulation. Each of ten naive users trained for ten 30-min sessions over five weeks. User performance improved significantly (p<0.001) over the sessions and ranged from 30-80% mean accuracy of the last three sessions (chance accuracy=25%). The incidence of correct selections depended on the target position. These results suggest that, with further improvements, a scanning protocol can be effective. The ultimate goal is to expand it to a large matrix of selections.
A brain-computer interface (BCI) provides a new non-muscular channel for communication and control to people with severe motor disabilities, such as amyotrophic lateral sclerosis (ALS), spinal cord injury or brainstem stroke. The BCI translates the electrophysiological signals of the brain into an output that reflects the user's intent (Wolpaw et al., 2002). In this study, we record the brain's electrophysiological signals by scalp-recorded electroencephalography (EEG). The users are asked to control mu (8-12 Hz) or beta (18-25 Hz) rhythms over sensorimotor cortex, i.e., sensorimotor rhythms (SMRs), (Buser and Rougeul-Buser, 1999; McFarland et al., 2000, Miner, et al., 2000; Sterman, 1999) to move a cursor on a computer screen. This control is exerted through changes in rhythmic activity, which take the form of amplitude suppression, the so-called Event-Related Desynchronization or ERD (Pfurtscheller, 1977; Pfurtscheller & Aranibar, 1977), or amplitude enhancement, the so-called Event-Related Synchronization or ERS (Pfurtscheller, 1992). ERD/ERS is frequency-band specific (Lopes da Silva & Pfurtscheller, 1999), and for the alpha and beta bands the functional significance of the oscillatory activity is related to the underlying network. ERD may reflect a decrease in synchronization of the underlying neural populations, which is indicated by a decrease of power (or amplitude) in specific frequency bands, together with an increased cellular excitability in the thalamo-cortical system. ERD can therefore be seen as an activated cortical state that is involved in the processing of information or the production of motor behavior. ERS represents an increase of synchronized behavior, which results in an increase of power (or amplitude) in specific frequency bands and a decrease of excitability. In this deactivated state, active processing of information is unlikely (Neuper & Pfurtscheller, 2001; Pfurtscheller & Lopes da Silva, 1999a, 1999b).
ERD/ERS can be generated intentionally by means of motor imagery which is used as the experimental strategy (Wolpaw et al., 2002).
In previous studies, users modulated features in their EEG to achieve one-dimensional (e.g., Krausz et al., 2003; McFarland et al., 2003, 2004; Mueller and Blankertz, 2006; Neuper et al., 2003; Pfurtscheller and Neuper, 2001; Wolpaw et al., 1991) or two-dimensional control (e.g., Wolpaw and McFarland, 1994, 2004) of a computer cursor in order to select targets at the edge of the screen. These studies sought to maximize the accuracy and information transfer rate of this non-muscular information channel and, more recently, to develop specific applications that can serve the needs of people with severe motor disabilities in their homes (Wolpaw et al., 2003). In the present study, we developed a scanning protocol that allowed users to select among multiple alternatives by modulating a single EEG feature. To the best of our knowledge, it is the first report of a scanning protocol controlled by SMR.
This study had three primary goals: 1) to determine if users can develop sufficient control of this SMR scanning protocol to make it of practical value; 2) to determine if all four target positions were selected with the same degree of accuracy; and 3) to determine in which segment of the 2.5-s selection window the maximum control (measured as r2) is located so that the task parameters can be optimized in future versions of the protocol.
The four-choice one-dimensional scanning protocol is illustrated in Figure 1. At t=0, four square-box choices were presented on the screen. The target appeared in red and the non-targets appeared in blue (Fig. 1A). At t=1 s, the scan started at the left edge of the screen by successively highlighting each box in yellow for 2.5 s each (without intermediate pauses) (Fig. 1B). During the scan, the user could make a selection of the target (see below). If no selection was made, the scanning sequence was repeated without pause until the time-out occurred after 30 s of scanning (i.e., after a total of three full scans).
Users were instructed to relax while the scan advanced automatically and to make their selection by motor imagery when the target choice was highlighted (Fig. 1C). If the selection was correct, the target turned green for 1 s and the other choices disappeared (Fig. 1D). This was considered a “hit.” If a choice other than the target was selected, the screen immediately turned blank for 1 s. This error was considered a “false alarm.” The 1-s disappearance of the other choices after a “hit” or the blanking of the screen after a “false alarm” provided post-trial feedback to the user. Passing a target without its being selected was counted as a “miss.” Scanning continued until a hit or error occurred or until three full scans of the four selections had occurred without a selection being made (30 s total time). If three complete scans of the four choices resulted in no selection, the scan stopped and the screen went blank after the last item was highlighted. This was considered a “time out” and counted as three “misses.”
Before the start of the next trial (Fig. 1F), the screen remained blank for 1.5 s (Fig. 1E). Each trial lasted until a selection was made or for a maximum of 30 s. Each run lasted for about three minutes and consisted of 19 trials on average (range of 7-29 trials per run). Each session contained eight runs with one-minute breaks in between each run.
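The trial logic described above can be sketched in a few lines of Python. This is only an illustrative simulation, not the BCI2000 implementation used in the study: the `selects` callback is a hypothetical stand-in for the online SMR criterion, and a selection is assumed to be registered at the end of a highlight period, a simplification of the real protocol, in which it could occur at any time during the 2.5-s window.

```python
from typing import Callable, Tuple

HIGHLIGHT_S = 2.5   # each box is highlighted for 2.5 s
N_CHOICES = 4       # boxes are scanned from left to right
MAX_SCANS = 3       # time-out after three full scans (30 s of scanning)


def run_trial(target: int,
              selects: Callable[[int, int], bool]) -> Tuple[str, int, float]:
    """Simulate one trial of the four-choice scanning protocol.

    target  : index (0-3) of the target box
    selects : hypothetical callback (scan, position) -> True if the user's
              SMR feature meets the selection criterion during that highlight
    Returns (outcome, misses, scanning_time_in_seconds).
    """
    misses = 0
    elapsed = 0.0
    for scan in range(MAX_SCANS):
        for position in range(N_CHOICES):
            elapsed += HIGHLIGHT_S
            if selects(scan, position):
                outcome = "hit" if position == target else "false alarm"
                return outcome, misses, elapsed
            if position == target:
                misses += 1  # the target was passed without being selected
    # Three full scans without any selection: counted as three misses.
    return "time out", misses, elapsed
```

For example, a user who never produces a selection yields `("time out", 3, 30.0)`, which matches the 30-s limit and the three misses counted for a time out.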
This study included 10 people (ages 28-66, 6 women and 4 men) who had never before used a BCI. Each user participated in an initial screening and then ten 30-min sessions of the scanning protocol. Each user completed an average of two sessions per week over a period of 4-6 weeks. One user had amyotrophic lateral sclerosis (he used a wheelchair and had limited upper-body mobility), another had thoracic outlet syndrome (which caused him no limitations or interference), and the other 8 had no known motor or neurological disabilities. All gave informed consent for the study, which had been reviewed and approved by the New York State Department of Health Institutional Review Board.
The 10 users first underwent the standard BCI screening (McFarland, Miner et al., 2000). Each user sat in a reclining chair facing a 38 × 28-cm monitor at a distance of 2 m, and wore an elastic electrode cap (Electro-cap International, Blom and Anneveldt, 1982) with tin (Polich and Lawson, 1985) scalp electrodes in the 64 positions standard for EEG recording according to the modified 10-20 system (Sharbrough et al., 1991). Each electrode was referenced to the right earlobe and grounded at the right mastoid. The data were filtered (0.1-50 Hz), amplified (20,000 times), and digitized (160 Hz). The user was asked to perform several motor actions or to imagine performing them while EEG was recorded. Specifically, EEG was recorded for movement or imagined movement of left or right hand, both hands, or both feet. Each of these four actions was performed three times for two minutes with one minute breaks in between. For each user, the analysis determined the electrode position and frequency band between 8 and 28 Hz that had the highest r2 (for predicting right hand vs. left hand movements or imagery, or both hands vs. both feet). This position and this frequency were then used as the feature that controlled selection online in the scanning protocol.
The method of recording the EEG in the scanning protocol was identical to that in the screening. In the 10 scanning sessions, the users were instructed, by means of a PowerPoint presentation, how the scanning protocol worked. They were asked to remain motionless during the runs. After every session, topographical and spectral analyses of r2 values and voltage spectra were performed. These analyses showed which electrode position and which frequency band exhibited the highest r2. If, over several sessions, the peak r2 value was at an electrode position or frequency band different from the ones at which the feature had been set, we adjusted the feature according to these analyses. Throughout the study, comprehensive frequency and topographical evaluations of r2 established that online control was being provided by actual EEG activity rather than by EMG or EOG activity (Goncharova et al., 2003; McFarland et al., 2004).
The protocol was implemented using our laboratory's general-purpose BCI software platform BCI2000 (Schalk et al., 2004), and all recorded data were stored for offline analysis. For online control, one EEG channel over left sensorimotor cortex and/or one channel over right sensorimotor cortex were derived from the digitized data according to a spatial Laplacian filter (McFarland et al., 1997). Every 50 ms, the most recent 400-ms segment from each channel was analyzed by a 16th-order autoregressive model using the Burg algorithm (Marple, 1987) to determine the amplitude (i.e., square root of power) in a 3-Hz-wide mu or beta frequency band. The amplitudes of the one or two channels were used in a linear equation that controlled selection. The feature locations used were C3, C1, Cz, C4, CP3, CP1, CP2, and CP4. The center frequencies used were 8, 9, 10, 11, 12, 15, 16, 17, 18, 19, 20, 24, 25, 27, and 28 Hz.
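The spectral estimation step can be illustrated with a minimal NumPy sketch of Burg's method (the autoregressive estimator described in Marple, 1987) applied to a sliding 400-ms window. The function names, the mean removal before fitting, the frequency-grid resolution, and the one-sided scaling of the spectrum are assumptions made for illustration; they are not taken from the BCI2000 source, and the absolute scale of the returned amplitude is arbitrary.

```python
import numpy as np

FS = 160      # sampling rate in Hz (from the recording description)
WINDOW = 64   # 400 ms at 160 Hz
STEP = 8      # 50 ms at 160 Hz
ORDER = 16    # autoregressive model order


def burg_ar(x, order):
    """Estimate AR coefficients with Burg's method.

    Returns (a, e) such that a[0] == 1 and x[n] + sum_i a[i] * x[n - i]
    is modeled as white noise with variance e.
    """
    x = np.asarray(x, dtype=float)
    f = x[1:].copy()     # forward prediction errors
    b = x[:-1].copy()    # backward prediction errors, delayed by one sample
    a = np.array([1.0])
    e = np.dot(x, x) / len(x)
    for _ in range(order):
        # Reflection coefficient minimizing forward + backward error power.
        k = -2.0 * np.dot(f, b) / (np.dot(f, f) + np.dot(b, b))
        a_ext = np.concatenate([a, [0.0]])
        a = a_ext + k * a_ext[::-1]                  # Levinson recursion
        f, b = (f + k * b)[1:], (b + k * f)[:-1]     # update, then realign
        e *= 1.0 - k * k
    return a, e


def band_amplitude(x, center_hz, fs=FS, width_hz=3.0, order=ORDER, n_bins=31):
    """Amplitude (square root of power) in a band of the AR spectrum."""
    x = np.asarray(x, dtype=float)
    a, e = burg_ar(x - x.mean(), order)   # mean removal is an assumption
    freqs = np.linspace(center_hz - width_hz / 2, center_hz + width_hz / 2, n_bins)
    lags = np.arange(len(a))
    transfer = np.exp(-2j * np.pi * np.outer(freqs, lags) / fs) @ a
    psd = 2.0 * e / (fs * np.abs(transfer) ** 2)   # one-sided spectral density
    return np.sqrt(psd.mean() * width_hz)          # approximate band integral


def sliding_amplitudes(x, center_hz):
    """Yield one band amplitude every 50 ms from the most recent 400 ms."""
    for end in range(WINDOW, len(x) + 1, STEP):
        yield band_amplitude(x[end - WINDOW:end], center_hz)
```

With only 64 samples per window, a 16th-order model is fitted to very little data, which is one reason a parametric estimator such as this is commonly preferred over a plain FFT for short EEG segments.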
In all cases but one, the selection was made by reduction in feature amplitude. The exception was user J who selected by increasing feature amplitude. A selection was made if feature amplitude was under (or over for user J) a proportion of the threshold. This corresponds to an activation of motor cortex due, for example, to motor imagery. This threshold was defined as the average of the feature amplitudes for the last three 2.5-s periods of each target position in which a choice was highlighted. In the study presented here, the proportion was 0.9 in the first and 0.8 in the subsequent sessions for all users regardless of their performance.
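The selection rule can be sketched as follows. The description of the threshold leaves some room for interpretation; this sketch assumes that the mean feature amplitude of the three most recent highlight periods is stored for each of the four positions and that the threshold is the average of all stored values. The class and method names are hypothetical, and the rule for selection by amplitude increase is a literal reading of the text.

```python
from collections import deque


class SelectionThreshold:
    """Running threshold for the scanning protocol (illustrative only)."""

    def __init__(self, proportion=0.8, n_positions=4, history=3,
                 select_on_decrease=True):
        self.proportion = proportion              # 0.9 in session 1, 0.8 afterwards
        self.select_on_decrease = select_on_decrease
        # One short history of highlight-period amplitudes per position.
        self.recent = [deque(maxlen=history) for _ in range(n_positions)]

    def update(self, position, mean_amplitude):
        """Store the mean feature amplitude of a completed highlight period."""
        self.recent[position].append(mean_amplitude)

    def threshold(self):
        values = [v for history in self.recent for v in history]
        return sum(values) / len(values) if values else None

    def is_selection(self, amplitude):
        """Apply the criterion to the current feature amplitude."""
        thr = self.threshold()
        if thr is None:
            return False          # no reference data collected yet
        if self.select_on_decrease:
            return amplitude < self.proportion * thr
        # Literal reading for a user who selects by increasing amplitude.
        return amplitude > self.proportion * thr
```

With a threshold of 10 (arbitrary units) and a proportion of 0.8, an amplitude of 7.9 triggers a selection and an amplitude of 8.1 does not.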
Accuracy was calculated as the ratio between the number of correct selections and the sum of the number of correct and incorrect selections. Information transfer rate was calculated as:
$$B = \log_2 N + P \log_2 P + (1 - P) \log_2\left(\frac{1 - P}{N - 1}\right)$$
(Pierce, 1980) where B is the number of bits transmitted per trial, N is the number of possible targets, and P is the a priori probability that the target is hit. The user's control of the EEG was measured as the correlation (r2) between the amplitude of the EEG signal and whether or not a given target was correct. Thus r2 is the proportion of the total variance in the EEG amplitude that is accounted for by the label of the choice (Sheikh et al. 2003).
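As a worked illustration, the two performance measures can be computed as below. The sketch assumes the formula commonly used in the BCI literature with these symbols, B = log2 N + P log2 P + (1 - P) log2[(1 - P)/(N - 1)]; the function names are hypothetical, and the conversion to bits per minute requires a trial rate that must be supplied by the caller.

```python
import math


def accuracy(hits, false_alarms):
    """Correct selections divided by all selections (correct + incorrect)."""
    return hits / (hits + false_alarms)


def bits_per_trial(n_targets, p):
    """Bits per trial for N possible targets and hit probability P."""
    if n_targets < 2 or not 0.0 <= p <= 1.0:
        raise ValueError("need n_targets >= 2 and 0 <= p <= 1")
    bits = math.log2(n_targets)
    if p > 0.0:                      # the term P*log2(P) tends to 0 as P -> 0
        bits += p * math.log2(p)
    if p < 1.0:                      # likewise for the (1 - P) term
        bits += (1.0 - p) * math.log2((1.0 - p) / (n_targets - 1))
    return bits


def bits_per_minute(n_targets, p, trials_per_minute):
    """Information transfer rate for a given (assumed) trial rate."""
    return bits_per_trial(n_targets, p) * trials_per_minute
```

At chance accuracy (N=4, P=0.25) the formula gives 0 bits per trial, and at P=0.91 it gives about 1.42 bits per trial. As a rough check, assuming the average rate of 19 trials per three-minute run, this corresponds to roughly 9 bits/min.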
Figure 2 displays the learning curve averaged for all users. Chance accuracy was 25%. Average accuracy rose from 35% (SD=14) in the first session and exceeded 50% from the fifth session on. The peak was in the eighth session with 57% (SD=20) mean accuracy. Across users, mean accuracy over the last three sessions ranged from 30% to 80%. With this accuracy range, information transfer rate ranged from 0.2 to 6.2 bits/min on average in the last three sessions. The best single-session performance was 91% accuracy (or 9 bits/min) in the 4th session. Accuracy and information transfer rate were highly positively correlated [r=0.987, p<0.001 (two-tailed) by Spearman bivariate correlation]. Thus, further statistical computations are reported only for accuracy.
Figure 3 shows the single-session accuracies across sessions for one of the best users (D), one average performer (H), one of the worst users (B), and, in addition, the user with ALS (J). It is clear that users varied considerably in both their rates of improvement and overall accuracy.
To assess improvement over sessions, we performed a repeated-measures analysis of variance with session as the within-subject factor and accuracy as the dependent variable. User performance improved significantly over the 10 sessions [F(6.24, 493.10)=17.71, p<0.001 (Greenhouse-Geisser corrected)].
The numbers of incorrect selections (i.e., “false alarms”) significantly exceeded the numbers of misses [t(99)=10.23, p<0.001 (two-tailed) by paired t-test]. This was true for every session when averaging the number of errors over all users and for every user when averaging the number of errors over all sessions (Figure 4). The frequencies of false alarms and misses tended to be negatively correlated, but this relationship did not reach significance [r=-0.179, p=0.075 (two-tailed) by Pearson bivariate correlation].
For each session of each user, we computed the correlation (measured as r2) between each choice (i.e., whether it was or was not the target) and EEG voltage across the frequency spectrum for each scalp location. The resulting r2 spectra and scalp topographies were used to guide session to session adjustments in the feature used to control selection online, and also served to verify that the user's control over the feature reflected actual SMR modulation rather than electromyographic (EMG) or other artefacts. Figure 5 shows for a single session from one user the r2 spectrum at the location used for online control (i.e., C1) and the r2 topography at the frequency used for online control (i.e., 24 Hz). Control is clearly focused in the frequency band (Fig. 5B and 5C) and at the location (Fig. 5A) used for control online. The sharp spectral and topographical foci shown here and also found in the other users indicate that this is true SMR control and not EMG artefact (Goncharova et al., 2003; McFarland et al., 2004). Across users and sessions, the r2 value of the feature used for control was strongly correlated with accuracy [r=0.936, p<0.001 (two-tailed) by Spearman bivariate correlation]. These analyses show that the users controlled the scanning protocol with SMR activity and that their degree of control of this activity determined their level of performance.
We also wanted to determine whether the four target positions in the scanning protocol were selected with equal accuracy (i.e., percentage of correct selections). The percentage of correct selections for each target position was calculated as the ratio between the number of correct selections of a target position achieved by the users and the total number of possible correct selections of this same target position (i.e., how often this position was the target and thus a correct selection was possible). This percentage was computed for every session of every user. We then performed an analysis of variance with target position and session as factors. As shown in Figure 6, the percentage of correct selections improved over the sessions [F(2.83, 25.45)= 4.42, p<0.05 (Greenhouse-Geisser corrected); as already described in 3.1]. There was no interaction between target position and session in the percentage of correct selections [F(5.89, 52.96)= 0.92, p=0.487 (Greenhouse-Geisser corrected)]. However, the important point is that there was a significant difference in the percentage of correct selections when comparing one target position to another [F(1.33, 11.96)=58.25, p<0.001 (Greenhouse-Geisser corrected), Bonferroni post-hoc test]. The frequency of correct selection decreased from the left-hand side of the display to the right-hand side: the first target (farthest to the left of the screen) was selected correctly most often, whereas the last target (farthest to the right of the screen) was selected correctly least often. These results were significant at p < 0.005.
To see if this effect was related to the individual user's performance, we calculated a correlation between the mean accuracy and the mean difference in the percentage of correct selections between the first and the fourth target position. This difference (measured in percentage points) was calculated by subtracting the percentage of the fourth from that of the first target position. The negative correlation between mean accuracy and mean difference in the percentage of correct selections between the first and the fourth target positions across sessions was highly significant [r=-0.838, p<0.005 (two-tailed) by Pearson bivariate correlation]: the higher the accuracy, the lower the difference between the first and the fourth target positions. This result demonstrates that the performance of the users had an impact on how often each target position was selected correctly.
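The position-specific measure can be made concrete with a short sketch. It assumes that each trial contributes one possible correct selection for its target position (the text could also be read as counting every pass of the target), and the trial representation as `(target_position, outcome)` pairs is hypothetical.

```python
from collections import Counter


def percent_correct_by_position(trials, n_positions=4):
    """Percentage of correct selections for each target position.

    trials : iterable of (target_position, outcome) pairs, where outcome is
             "hit", "false alarm" or "time out" (hypothetical encoding)
    """
    presented = Counter()
    hits = Counter()
    for target, outcome in trials:
        presented[target] += 1
        if outcome == "hit":
            hits[target] += 1
    return [100.0 * hits[p] / presented[p] if presented[p] else float("nan")
            for p in range(n_positions)]


def first_minus_fourth(percentages):
    """Difference in percentage points between the first and last position."""
    return percentages[0] - percentages[-1]
```

A user who hits the first position in 3 of 4 trials and the fourth position in 1 of 4 trials has a difference of 50 percentage points; correlating this difference with mean accuracy across users gives the analysis reported above.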
For each user, the first 2.4 s of the 2.5-s selection window was divided into six 400-ms time segments and r2 values were computed across the last five sessions for the feature used for online control. Although on average the peak r2 value of 0.083 (SD= 0.080) occurred in the second time segment (400-800 ms) (Figure 7), this was not true for every user or for every session.
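The segment-wise analysis can be sketched as follows, taking r2 to be the squared Pearson correlation between feature amplitude and a binary target label. How the amplitude of each 400-ms segment was obtained is not detailed here, so the sketch simply assumes a matrix with one row per highlight period and one column per segment; all names are hypothetical.

```python
import numpy as np

FS = 160          # sampling rate in Hz
SEGMENT = 64      # 400 ms at 160 Hz
N_SEGMENTS = 6    # first 2.4 s of the 2.5-s selection window


def split_segments(window):
    """Split one 2.5-s window (400 samples) into six 400-ms segments."""
    window = np.asarray(window, dtype=float)
    return window[:SEGMENT * N_SEGMENTS].reshape(N_SEGMENTS, SEGMENT)


def r_squared(values, is_target):
    """Squared Pearson correlation between amplitudes and target labels."""
    r = np.corrcoef(values, np.asarray(is_target, dtype=float))[0, 1]
    return r * r


def segment_r2(amplitudes, is_target):
    """r2 per time segment.

    amplitudes : array of shape (n_highlight_periods, n_segments)
    is_target  : 1 if the highlighted box was the target, else 0
    """
    amplitudes = np.asarray(amplitudes, dtype=float)
    return np.array([r_squared(amplitudes[:, s], is_target)
                     for s in range(amplitudes.shape[1])])
```

The index of the largest value returned by `segment_r2` identifies the segment with maximum control, for example index 1 for the 400-800 ms segment.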
We computed an analysis of variance with the time segments as a within-subject factor to determine if there was a significant difference between the six time segments. We added accuracy of users as a between-subject factor because we wanted to evaluate whether performance was related to the time course of r2. Therefore, we divided the users into two groups: a group of less skilled performers with mean accuracy under 50% (M=38.72, SD=10.00, N=6), which had significantly lower accuracy [t(8)=5.655, p<0.001 (two-tailed) by independent samples t-test] than the group of better performers with mean accuracy over 50% (M=66.50, SD=5.75, N=4).
The time segments were significantly different [F(1.32, 10.52)=9.421, p<0.01 (Greenhouse-Geisser corrected)]. As expected, the factor accuracy was significant [F(1, 8)=9.684, p<0.05]. Good performers had a significantly higher mean r2 value (M=0.097, SD=0.018) than bad performers (M=0.024, SD=0.015). However, the interaction between accuracy and time segments was not significant [F(1.32, 10.52)= 2.246, p=0.162 (Greenhouse-Geisser corrected)]. For both good and bad performers, r2 usually peaked in the 400-800 ms segment.
In a scanning protocol, the user can relax while the scan progresses automatically until the correct target is highlighted. When the correct target appears, the user selects it by activation of sensorimotor cortex. A scanning protocol allows the user to choose among multiple choices using learned modulation of just one SMR feature, reducing the training time required. Because of this advantage, the ultimate goal is to expand this protocol to include a large matrix of choices, where rows and columns are scanned to select the target. The automatic scanning and the use of the same SMR modulation for every choice are a major difference between a scanning protocol and other SMR-based BCI applications, where a binary selection is necessary. For example, in the traditional tasks at the Wadsworth Center, users modulate EEG features to move a cursor to targets at the top or bottom edge of the screen (McFarland et al. 2003, 2004, 2006; Wolpaw et al., 1991). In the Graz BCI, users modulate EEG features to move a cursor to targets at either side of the screen (Krausz et al., 2003; Neuper et al., 2003). In the Berlin BCI, users can control the mental typewriter Hex-o-Spell by turning an arrow clockwise with one mental state (e.g., imagination of a right-hand movement) and making a selection with another mental state (e.g., imagination of a right-foot movement) (Mueller & Blankertz, 2006).
This study set out to evaluate the utility of a sensorimotor rhythm-based scanning protocol for basic communication. The results are both encouraging and sobering. On the one hand, the users did obtain statistically significant and in some cases quite impressive control of target selection. This control improved over the initial training sessions, and might well improve further with continued training. The protocol could in principle be used to select among a much larger number of choices, and thereby provide substantial practical capabilities. The accuracy of target selection was closely correlated with the control (measured by r2) of the EEG feature used for selection (Sheikh et al. 2003). Topographical and spectral analyses confirmed that this feature control reflected sensorimotor-rhythm control rather than control of non-EEG activity such as EMG.
On the other hand, the control achieved within the 10 sessions of the study varied widely both within and across users, the frequencies of both false alarms and misses were substantial, and target position had a considerable influence on performance.
The reasons for the differences in performance between the users are unknown. The group of users participating in this study was very heterogeneous. We considered it important to test the novel protocol on naïve users. The only data available to predict the users' performance came from the screening, which did not correlate significantly with performance in the scanning protocol [r=0.258, p=0.471 (two-tailed) by Pearson bivariate correlation, n=10] and thus may not have been a good predictor. We found no pattern in age or sex that correlated with performance. The differences in performance may lie in motivational or emotional factors or in the use of different strategies. Whereas most of the users imagined moving a body part for selection, two very good users (C and D) used highly individual and even emotional imagery for relaxation and selection. User D reported imagining starving for food while relaxing and grasping the food while selecting. User C thought of flowers, or of pleasant landscapes she had visited on holiday, while relaxing, and of picking the flowers while selecting. When she felt that she was making too many false alarms, she imagined that a bird, rather than she herself, was picking the flowers. Motivational factors could have been an issue for user H, who had physiologically very good features but performed only moderately well, and for user G, who was under considerable work-related stress during these months.
The trade-off between false alarms and misses could also be related to personal factors. For example, the third very good user (A) was very competitive and reported in the 4th session (when he first exceeded 80% accuracy) that in the first sessions he had had problems selecting the last target, but that it had now become more of a challenge to select the first one. User E, who was perhaps too ambitious and uneasy, did not acquire very reliable control and most of the time selected the first target, thus making many false alarms. In contrast, user I, who was neither ambitious nor energetic, generally made more misses than false alarms in the last sessions. When user B felt more relaxed, she could let the scan advance to the box right before the target rather than always selecting the first one.
While more prolonged training might well produce further improvements in performance, it is clear that changes in the format and in the translation algorithm will probably be required if the scanning protocol is to become useful for clinically practical BCI applications. Several kinds of changes are worth exploring.
Performance might be stabilized and further enhanced by incorporation of adaptive feature selection methods (McFarland et al., 2003; Krausz et al., 2003; McFarland and Wolpaw, 2003). This feature adaptation might be coupled with adaptive selection of the particular time segments used for control. Our analysis of user control as a function of time over the 2.5-s selection window showed that not all time segments are of equal value. The maximum r2 value was on average in the 400-800 ms time segment of the 2.5-s selection window. This result is consistent with the study of Wolpaw et al. (1997), who found that changes in mu or beta amplitudes have latencies of about 0.5 s. The segment of greatest value varied across users. An adaptive algorithm could select for each user the optimally weighted combination of time segments. It might similarly optimize for each user the total duration of the selection window. An adaptive algorithm could also adjust the proportion of the threshold used to determine whether a selection occurred, so as to equate false alarms and misses.
Target position clearly had a major impact on performance. The frequency of correct selections was inversely related to target position (left to right). Thus, the percentage of correct selections was highest when the target was the first alternative and lowest when it was the last. McFarland and Wolpaw (2003) indicated that performance is better when all targets are equally accessible to the user. Our present finding that bias was less pronounced for those users with higher accuracy implies that improvement in performance should help to reduce the problem. Substantial improvement might be achieved if the translation algorithm adjusted the parameters that control selection (e.g., the threshold and time segment) as a function of target position. Such position-specific parameter adjustments might also take into account pre-selection anticipatory changes in sensorimotor rhythm activity (e.g., Bastiaansen et al., 1999). Another alternative or additional solution might be to reduce the impact of order by adopting a circular format such as that described in Mueller and Blankertz (2006).
Another way to improve performance would be to incorporate the detection of error potentials into the protocol (Ferrez and Millán, 2007; Schalk et al., 2000). Schalk et al. (2000) showed that an incorrect selection can be detected by a positive potential at the vertex that occurs about 180 ms after a mistake. The use of this potential to detect and thereby cancel false alarms (and possibly to correct misses as well) might significantly improve performance, particularly as measured by information transfer rate.
In summary, sensorimotor rhythm-based scanning protocols comparable to that used in this study are a promising option for BCI communication. This BCI method is non-invasive, uses standard EEG recording, and requires SMR control that most subjects can acquire quite readily. After further optimization as discussed here, this protocol could be a communication option of significant value for people with severe motor disabilities.
This work was supported by the National Center for Medical Rehabilitation Research, NICHD, NIH (Grant HD30146), the NIBIB/NINDS, National Institutes of Health (Grant EB00856), the James S. McDonnell Foundation, the ALS Hope Foundation, the NEC Foundation, the Altran Foundation, the Helen Hayes Hospital and the University of Graz.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.