We have demonstrated, for the first time, real-time BCI control based on pure covert visuospatial attention, completely independent of eye movements and evoked responses. In a telepresence application in which a robot was navigated through a course containing four targets, the user communicated the intended movement by covertly directing attention between four different regions in the visual field. All four subjects were able to control the robot, and each reached at least three of the four targets. All subjects expressed the feeling of having control over the robot, even during the initial practice session. This supports the notion that COVISA-based BCI control is intuitive and requires virtually no training (van Gerven and Jensen 2009; Andersson et al. 2011; Treder et al. 2011a). Although our study is the first demonstration of an applied BCI based on the visual system that is completely free from evoked responses and does not require eye movements, the concept of employing the visual system is not new. One example is BCI based on the steady-state visually evoked potential (SSVEP). The SSVEP is an evoked response present during flickering stimulation of the retina and is detected as an increase of power in the EEG or MEG signal at the stimulation frequency. The P300 is another evoked response, an event-related potential (ERP), that has been used for BCI. This response occurs approximately 300 ms after rare stimulus events. The matrix speller, first described by Farwell and Donchin (1988), is a BCI based on the visual P300 response in EEG signals. Besides being intrinsically dependent on external visual stimulation, visual P300 and SSVEP BCI systems are, according to growing evidence, dependent on gaze control to some degree, yielding better results if subjects direct their gaze to the target rather than fixating elsewhere (Allison et al. 2008; Shishkin et al. 2009; Bianchi et al. 2010; Brunner et al. 2010; Treder and Blankertz 2010).
For safety reasons inherent to the high magnetic field, we could not bring an eye tracker into the scanner environment (Andersson et al. 2011). Thus, we could not obtain online measures of eye movements. However, it has been shown repeatedly that people have no trouble performing covert spatial attention shifts without any eye movements (Brefczynski and DeYoe 1999; Siman-Tov et al. 2007; Munneke et al. 2008; Datta and DeYoe 2009; van Gerven et al. 2009; Andersson et al. 2011). Moreover, the brain activity patterns obtained during BCI control strongly suggest (Andersson et al. 2011) that the subjects controlled the robot via covert shifting of attention, and not with eye movements. It is well known that covert shifting of attention to one side induces elevated activity in the contralateral visual cortex (Brefczynski and DeYoe 1999; Brefczynski-Lewis et al. 2009; Perry and Zeki 2000
). As can be seen in Fig. , the bulk of activity is contralateral for left and right attentional shifts. If eye movements had been used to control the robot, we would expect the opposite result, since most of the visual information would shift to the hemifield opposite to the direction of eye movement, causing activity in the visual cortex ipsilateral to that direction. Upward and downward shifts are associated with inferior and superior visual cortex activation, respectively. Again, the activity patterns are in agreement.
To classify each image volume, we trained a support vector machine (SVM) on the initial localizer data. The application of multivariate classification techniques to fMRI data has been shown to be effective in multiple studies (e.g. LaConte et al. 2005; Sitaram et al. 2011). Since fMRI volumes usually include a very large number of voxels, a feature selection step is most often included to remove uninformative voxels and avoid overfitting. Our feature selection was based on an online univariate GLM analysis. A multivariate feature selection method could potentially create a map better optimized for the SVM classifier, but our strategy is fast, and it allowed us to finish the feature selection and training within a single TR. The overlap of selected voxels across sessions shows that some regions in expected parts of the cortex are consistently selected (Fig. ). Around these "hot-spots" there are voxels selected in only a few sessions. There can be several reasons for this distribution. First, visual field maps vary considerably across individuals (Dougherty et al. 2003; Yamamoto et al. 2012). Second, the alignment of the functional data from the two different sessions, as well as the spatial normalization, may not have been perfect, causing an apparent shift. Third, there could be small variations in where in the visual field subjects directed their attention. They reported that they tried different strategies in order to feel confident in directing their attention. These strategies included imagining a beam of light shining from the center onto the target of interest, and pretending to expect a symbol to show up at the target. A change of strategy could potentially result in variations in the selected voxels. It is also possible that the brain activation pattern changes in the course of learning to control the BCI. The current study, with only three sessions, does not allow an adequate assessment of this effect. We are planning a study with multiple sessions aimed at elucidating this particular topic.
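The two-stage scheme described above (fast univariate voxel ranking followed by a linear classifier on the retained voxels) can be illustrated with a minimal sketch. This is not the pipeline used in the study. It assumes synthetic data, a one-way F statistic as a stand-in for the GLM-based ranking, and a one-vs-rest linear SVM fitted by plain subgradient descent. All function names, array sizes and parameter values are hypothetical.

```python
import numpy as np

def select_voxels(X, y, n_keep):
    """Rank voxels by a one-way F statistic across classes (a stand-in for
    the univariate GLM ranking) and return the indices of the top n_keep."""
    classes = np.unique(y)
    grand_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in classes:
        Xc = X[y == c]
        between += len(Xc) * (Xc.mean(axis=0) - grand_mean) ** 2
        within += ((Xc - Xc.mean(axis=0)) ** 2).sum(axis=0)
    df_between = len(classes) - 1
    df_within = X.shape[0] - len(classes)
    f_stat = (between / df_between) / (within / df_within + 1e-12)
    return np.argsort(f_stat)[::-1][:n_keep]

def train_linear_svm(X, y, n_classes, lam=0.01, epochs=200, lr=0.1):
    """One-vs-rest linear SVM: batch subgradient descent on the
    L2-regularised hinge loss (illustrative, not an optimised solver)."""
    n, d = X.shape
    W = np.zeros((n_classes, d))
    b = np.zeros(n_classes)
    for c in range(n_classes):
        t = np.where(y == c, 1.0, -1.0)          # +1 for class c, -1 otherwise
        for _ in range(epochs):
            margin = t * (X @ W[c] + b[c])
            active = margin < 1                   # samples violating the margin
            grad_w = lam * W[c] - (t[active][:, None] * X[active]).sum(axis=0) / n
            grad_b = -t[active].sum() / n
            W[c] -= lr * grad_w
            b[c] -= lr * grad_b
    return W, b

def predict(X, W, b):
    """Assign each volume to the class with the largest decision value."""
    return np.argmax(X @ W.T + b, axis=1)

# Synthetic "localizer" data: 4 attention directions, 500 voxels, of which
# voxels 0..19 are informative (5 per direction). Purely an assumption.
rng = np.random.default_rng(0)
n_voxels, n_classes = 500, 4

def simulate(y):
    X = rng.normal(size=(len(y), n_voxels))
    for c in range(n_classes):
        X[y == c, 5 * c:5 * c + 5] += 2.0
    return X

y_train = np.repeat(np.arange(n_classes), 40)
X_train = simulate(y_train)
y_test = np.repeat(np.arange(n_classes), 10)
X_test = simulate(y_test)

voxels = select_voxels(X_train, y_train, n_keep=20)
W, b = train_linear_svm(X_train[:, voxels], y_train, n_classes)
accuracy = (predict(X_test[:, voxels], W, b) == y_test).mean()
print(f"selected {len(voxels)} voxels, test accuracy {accuracy:.2f}")
```

Because the ranking treats every voxel independently, its cost grows only linearly with the number of voxels, which is the property that makes a univariate step attractive when selection and training must fit within a single TR.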
Several BCI systems built on fMRI have been described (Yoo et al. 2004; Sitaram et al. 2007; Moench et al. 2008; Sitaram et al. 2008; Sorger et al. in press). These systems can, for instance, be employed as in this study for evaluating new BCI control paradigms, or for determining the best choice of brain function for a specific patient population. However, the ultimate goal is to develop a BCI system that can function in the everyday life of patients. Clearly, MRI is then no longer an option, so implementation in a portable system is required to bring the technology to paralyzed users. Given the fine-grained spatial distribution of activated brain areas, it is unlikely that our results could be replicated using scalp electrodes. Instead, intracranial recordings may prove to be effective (Andersson et al. 2011). For successful BCI control, the responses to each of the attention directions need to be distinguished reliably, not only from each other but also from the responses to the visual input provided by the video feedback. As seen in the activation maps, and as predicted by retinotopic studies, multiple cortical regions corresponding to the multiple visual maps become active during each direction of attention. The brain response to the central input provided by the video camera is strong and spatially close to the attention-modulated effects. Thus, the inherent limitations of EEG in terms of spatial resolution and signal strength will probably make it ineffective.
Both EEG and MEG have been used for investigating covert visuospatial attention for BCI control (Kelly et al. 2005; van Gerven and Jensen 2009; Treder et al. 2011a). However, none of these studies demonstrated real-time online decoding or visual feedback of the performance. It should also be noted that, since MEG systems are not portable, BCI systems built on this technology cannot be used in the everyday life of patients (as is the case for fMRI-based systems). In a recent study, Treder et al. (2011b) used EEG to implement an ERP-dependent BCI speller based on both spatial and feature (color) attention that did not depend on eye movements. They evaluated two variants of the speller interface that were sensitive to spatial attention and one that was not. They found the best performance with the variant that was not sensitive to spatial attention. For the other two variants, which incorporated both spatial and feature attention, the performance dropped substantially when only the occipital electrodes were used. This suggests that they did not succeed in detecting the brain response to spatial attention.
Functional near-infrared spectroscopy (fNIRS) is an optical technique that measures the local oxygenation level in the cortex via light emitters and sensors placed on the scalp. fNIRS systems are portable and can therefore be used for BCI at home in patients’ daily life (see review in Matthews et al. 2008). However, this technique has a much lower spatial resolution than fMRI and is limited to cortical regions close to the scalp. Thus, for the same reasons as for EEG, it will be hard to separate attention towards multiple directions using fNIRS.
Intracranial recordings would most likely be suitable for COVISA BCI. These techniques provide high spatial resolution and give access to the higher frequencies that are too weak to be detected using scalp electrodes. The power in the gamma band (65–95 Hz) has been shown to correlate well with the BOLD signal (Lachaux et al. 2007; Hermes et al. 2012). Real-time fMRI can therefore facilitate BCI training, and the activation pattern is likely to indicate the most reliable implant sites (Vansteensel et al. 2010) and make it possible to limit the cortical area that needs to be covered with electrodes for decoding. In the present study, attention was kept directed at the targets for several seconds, allowing the BOLD effect to build up. This would no longer be necessary when classifying electrophysiological signals. Hence, for an intracranial BCI system, short shifts of attention may well be sufficient.
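To make the idea of a gamma-band feature concrete, the sketch below estimates the 65–95 Hz power of a short signal window with a windowed periodogram. It is only an illustration under stated assumptions. The sampling rate, the 0.5 s window, the simulated 80 Hz component and the function name `band_power` are all hypothetical and are not taken from any of the cited recordings.

```python
import numpy as np

def band_power(signal, fs, f_lo, f_hi):
    """Mean periodogram power of a 1-D signal between f_lo and f_hi (Hz),
    computed after mean removal and Hann windowing."""
    windowed = (signal - signal.mean()) * np.hanning(len(signal))
    spectrum = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    in_band = (freqs >= f_lo) & (freqs <= f_hi)
    return spectrum[in_band].mean()

# Assumed recording parameters: 512 Hz sampling, 0.5 s analysis window.
fs = 512.0
t = np.arange(256) / fs

# Simulated single-electrode traces: background noise only ("rest") and the
# same noise plus an 80 Hz component standing in for an attention-related
# gamma increase ("attend").
rng = np.random.default_rng(1)
rest = rng.normal(size=t.size)
attend = rest + 1.0 * np.sin(2 * np.pi * 80.0 * t)

gamma_rest = band_power(rest, fs, 65.0, 95.0)
gamma_attend = band_power(attend, fs, 65.0, 95.0)
print(f"gamma power ratio attend/rest: {gamma_attend / gamma_rest:.1f}")
```

A feature of this kind can be computed on sub-second windows, in contrast to the several seconds needed for the BOLD response to build up.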
In a recent study (Gunduz et al. 2011), covert visual attention was studied with electrocorticography (ECoG) using a classical cueing task. Distinct foci of activity were found, indicating that the associated brain signals were readily detectable. Moreover, in a previous paper (Andersson et al. 2011) we obtained a performance of 70 % with post-hoc offline analysis of ECoG data recorded during a two-direction visual attention task.
A benefit of COVISA-based BCI control is that more directions can be added to achieve more detailed control, as long as the responses can be separated. Moreover, the concept allows for optimizing the brain signals (and the discrimination thereof) by adjusting the positions of the attention target regions in the visual field. In conclusion, we have shown that navigation of a robot in real time is feasible with COVISA BCI. Given that the central video display did not interfere with the generation of movement instructions for the robot, covert shifting of attention to the periphery can be performed without interfering with the processing of information in the center of the visual field. Conceptually, more than the current three directions can be decoded (diagonal directions or even more), but this requires further investigation.