Muscle or electromyogenic (EMG) artifact poses a serious risk to inferential validity for any electroencephalography (EEG) investigation in the frequency-domain owing to its high amplitude, broad spectrum, and sensitivity to psychological processes of interest. Even weak EMG is detectable across the scalp in frequencies as low as the alpha band. Given these hazards, there is substantial interest in developing EMG correction tools. Unfortunately, most published techniques are subjected to only modest validation attempts, rendering their utility questionable. We review recent work by our laboratory quantitatively investigating the validity of two popular EMG correction techniques, one using the general linear model (GLM), the other using temporal independent component analysis (ICA). We show that intra-individual GLM-based methods represent a sensitive and specific tool for correcting on-going or induced, but not evoked (phase-locked) or source-localized, spectral changes. Preliminary work with ICA shows that it may not represent a panacea for EMG contamination, although further scrutiny is strongly warranted. We conclude by describing emerging methodological trends in this area that are likely to have substantial benefits for basic and applied EEG research.
Separating neural signals from biological artifacts is a ubiquitous problem for any neurophysiological investigation. Biological artifacts can compromise sensitivity or masquerade as genuine effects, particularly when they are confounded with statistical contrasts. When such artifacts are markedly smaller than neural signals, they can be safely ignored. When they are rare, they may be rejected with trivial consequences. Here we review methods designed to correct a particular biological artifact—muscle or electromyogenic (EMG) activity—that is neither small nor rare.
Peak cranial EMG is ~1–2 orders of magnitude larger than typical mean differences in the EEG (75–400µV vs. <10µV), indicating that even modest contamination can seriously distort inferences. More importantly, because facial EMG is sensitive to a variety of cognitive and affective processes (Tassinary, Cacioppo, and Vanman 2007), it is often temporally confounded with experimental manipulations. Consequently, EMG poses a serious risk to inferential validity for investigations of ongoing, induced or evoked EEG in the frequency-domain.
Separation of any two biological signals—in this case myogenic from neurogenic—depends on the degree to which they can be discriminated on one or more dimensions: temporal, anatomical, or spectral. Unfortunately, in addition to co-occurring in time, EMG and EEG possess broadly overlapping anatomical and spectral profiles. EMG arises from muscles across the cranium (Figure 1a). Because of volume conduction, even weak EMG (Figure 1b) is detectable across the scalp (Goncharova, McFarland, Vaughan, and Wolpaw 2003) in frequencies at least as low as the alpha (8–13Hz) band (ibid).
Furthermore, EMG exhibits less stereotypy than other biological artifacts. Ocular and cardiac artifacts, for example, arise from fixed sources and do not fundamentally differ across individuals. EMG, on the other hand, arises from the activity of spatially distributed, functionally independent muscle groups, each with a distinct spectral signature (Goncharova et al. 2003). The relative contributions of these muscle groups to observed cranial EMG show substantial variability across individuals and elicitors (Tassinary et al. 2007). These observations indicate that canonical spatial or spectral filters cannot be used to correct EMG artifact. Nevertheless, for work employing high-density arrays, excluding sensors situated at the edge of the montage from analyses may provide a minimal degree of protection from gross EMG and other kinds of artifact. Because such sensors typically provide the clearest measure of such artifacts, they should only be dropped after the application of artifact correction techniques (e.g., regression, blind source separation).
Given the clear inferential hazards posed by EMG, there is great interest in developing tools to separate myogenic from neurogenic signals. Unfortunately, most published techniques are subjected to only modest attempts at validation (McMenamin, Shackman, Maxwell, Greischar, and Davidson in press). Ideally, validation would quantitatively establish that a technique possesses a high degree of sensitivity (i.e., attenuates myogenic artifact) and specificity (i.e., preserves neurogenic signals) in a reasonably large and varied sample.
Quantitatively establishing sensitivity and specificity requires data in which the presence and absence of EMG are definitive or can be reasonably assumed. Typically, this involves either simulations, in which EMG is “injected” into otherwise artifact-free EEG (Crespo-Garcia, Atienza, and Cantero 2008; De Clercq et al. 2005; Delorme, Sejnowski, and Makeig 2007; Fitzgibbon, Powers, Pope, and Clark 2007), or scripted data, in which participants alternate tensing and relaxing in response to instructions (McMenamin et al. in press). The two methods rest on complementary assumptions. The advantage of simulations is that the presence of EMG and the stability of neurogenic activity are definitive. The disadvantage is that the assumptions underlying “injection” may not characterize real EMG contamination, biasing the results in favor of techniques that make similar assumptions (Grouiller et al. 2007). The advantage of scripted data is the absence of that assumption. The disadvantage is the need to assume that EMG is absent during periods of quiescence, and that neurogenic activity is consistent across periods of tensing and quiescence. A potential limitation of both techniques is their assumption that myogenic activity makes a negligible contribution to the EEG band of interest during periods of quiescence (cf. Figure 1b). Depending on the band (e.g., gamma vs. alpha) and degree of compliance with instructions, this assumption may not be tenable (Whitham et al. 2008; Whitham et al. 2007).
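The logic of a simulation-based validation can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy 10 Hz "alpha" sinusoid, the use of scaled white noise as a stand-in for EMG, and the hypothetical helper names `band_power`, `inject_emg`, and `residual_artifact`. It is not the procedure used in any of the cited studies. Because the artifact-free trace is known, the share of injected artifact that survives a correction can be computed directly.

```python
import math
import random


def band_power(signal, fs, lo, hi):
    """Mean DFT power of `signal` over frequency bins in [lo, hi] Hz."""
    n = len(signal)
    powers = []
    for k in range(1, n // 2):
        if lo <= k * fs / n <= hi:
            re = sum(x * math.cos(2 * math.pi * k * t / n) for t, x in enumerate(signal))
            im = sum(x * math.sin(2 * math.pi * k * t / n) for t, x in enumerate(signal))
            powers.append((re * re + im * im) / n ** 2)
    return sum(powers) / len(powers)


def inject_emg(eeg, gain, rng):
    """Add toy 'EMG' (white noise scaled by `gain`) to an artifact-free trace.

    Assumption: real EMG is not white noise; this is only a placeholder.
    """
    return [x + gain * rng.gauss(0.0, 1.0) for x in eeg]


def residual_artifact(clean, contaminated, corrected):
    """Share of the injected artifact power that survives correction.

    0 means the corrected value matches the clean value; 1 means no change.
    """
    return (corrected - clean) / (contaminated - clean)


fs = 256  # assumed sampling rate in Hz
rng = random.Random(0)
eeg = [math.sin(2 * math.pi * 10 * t / fs) for t in range(fs)]  # toy 10 Hz "alpha"
contaminated = inject_emg(eeg, gain=5.0, rng=rng)

alpha_clean = band_power(eeg, fs, 8, 13)
alpha_contaminated = band_power(contaminated, fs, 8, 13)
```

A candidate correction would be applied to `contaminated`, its alpha-band power computed, and the result passed to `residual_artifact` together with `alpha_clean` and `alpha_contaminated`. The caveat raised above applies in full: a technique that assumes additive, broadband noise will look better on this toy data than it may on real recordings.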
One tool for correcting EMG artifacts involves using variants of the GLM (e.g., regression, ANCOVA) to remove variance in a neurogenic band of interest (e.g., alpha) that is predicted by activity in an EMG (e.g., 70–80Hz) band. The advantage of this technique is that it is automatic and, by performing separate corrections at each site, can accommodate anatomical variation. Also, it does not require dedicated EMG channels. While useful for ongoing and induced EEG investigations (Lutz, Greischar, Rawlings, Ricard, and Davidson 2004), it does not allow time-series reconstruction, limiting its usefulness for event-related spectral perturbation (ERSP) analyses.
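A minimal sketch of the regression variant may help make the idea concrete. It assumes that band power has already been computed for each observation (for example, each data segment) at each site, and it removes from the band of interest whatever is linearly predicted by the EMG band. The function names `residualize` and `correct_participant` are hypothetical, and re-centering on the mean EMG level is one possible choice rather than a documented feature of the published method.

```python
def residualize(eeg_power, emg_power):
    """Remove the part of eeg_power that is linearly predicted by emg_power.

    Both arguments are equal-length sequences of band-power values, one per
    observation. Returns (corrected_values, slope).
    """
    n = len(eeg_power)
    if n != len(emg_power) or n < 3:
        raise ValueError("need at least 3 paired observations")
    mean_x = sum(emg_power) / n
    mean_y = sum(eeg_power) / n
    sxx = sum((x - mean_x) ** 2 for x in emg_power)
    if sxx == 0:
        raise ValueError("EMG-band power is constant; nothing to regress on")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(emg_power, eeg_power))
    slope = sxy / sxx  # ordinary least squares, single predictor
    # Assumption: subtract only EMG-predicted deviations from the mean EMG
    # level, so the overall mean of the corrected band is unchanged.
    corrected = [y - slope * (x - mean_x) for x, y in zip(emg_power, eeg_power)]
    return corrected, slope


def correct_participant(eeg_by_site, emg_by_site):
    """Apply the correction separately at each site (dict: site -> values)."""
    return {
        site: residualize(eeg_by_site[site], emg_by_site[site])[0]
        for site in eeg_by_site
    }


# Toy data: alpha power = 2 + 0.5 * EMG power + a neural term uncorrelated with EMG
emg = [1.0, 2.0, 3.0, 4.0]
alpha = [3.5, 2.0, 2.5, 5.0]
corrected, slope = residualize(alpha, emg)
```

Fitting a separate slope at each site is what lets the approach accommodate anatomical variation. Because the output is corrected band power rather than a corrected voltage trace, the sketch also shows why the time series cannot be reconstructed afterwards.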
We recently tested this technique’s validity using scripted data (McMenamin et al. in press). High-density EEG data (125-channel; n=17) were acquired while neurogenic and myogenic activation was independently varied by crossing an alpha-blocking manipulation (i.e., eyes opened/closed) with low-intensity muscle activation (i.e., tensing/quiescence). Gross artifacts were rejected before correction, yielding a more realistic degree of contamination (Figure 1b). Inspection of the uncorrected data revealed that the peak and topography of the alpha-blocking contrast was markedly disturbed when changes in EMG and EEG covaried (Figure 1c).
We then examined the sensitivity and specificity of inter- and intra-individual variants of the GLM technique for correcting alpha-band neurogenic activity. Sensitivity was quantified by comparing “corrected” EMG-contaminated data to uncorrected EMG-free data in an anterior, myogenic region of interest (ROI). Equivalence tests (Seaman and Serlin 1998) were used as a follow-up. Specificity was similarly quantified using a posterior, neurogenic (i.e., alpha-blocking) ROI.
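The reasoning behind an equivalence test can be sketched as follows. The procedure actually used is not detailed here, so the sketch adopts a generic confidence-interval form, which is an assumption: corrected and EMG-free data are declared equivalent when the interval around their mean paired difference lies entirely within a pre-specified margin. The names `equivalence_interval` and `is_equivalent`, the margin `delta`, and the toy differences are all hypothetical. The critical t value is supplied by the caller.

```python
import math
import statistics


def equivalence_interval(diffs, t_crit):
    """Confidence interval for the mean of paired differences.

    `diffs` holds one value per participant (corrected minus EMG-free).
    `t_crit` is the one-sided critical t value for df = len(diffs) - 1.
    """
    n = len(diffs)
    mean = statistics.mean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)
    return mean - t_crit * se, mean + t_crit * se


def is_equivalent(diffs, delta, t_crit):
    """True if the whole interval falls inside (-delta, +delta)."""
    lo, hi = equivalence_interval(diffs, t_crit)
    return -delta < lo and hi < delta


# Toy paired differences for five hypothetical participants.
# 2.132 is the one-sided 5% critical t value for df = 4.
diffs = [0.1, -0.1, 0.2, -0.2, 0.0]
wide_margin = is_equivalent(diffs, delta=0.2, t_crit=2.132)
narrow_margin = is_equivalent(diffs, delta=0.1, t_crit=2.132)
```

Unlike a conventional null-hypothesis test, a non-significant difference is not enough here. The data must positively show that any remaining difference is smaller than the margin, which is why the choice of `delta` matters so much.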
Results showed that only intra-individual correction, which models correlations between EEG and EMG bands across 1.024-s segments separately for each participant (dfObserved = number of segments - 2), showed adequate performance (Figure 1c). Unfortunately, parallel analyses on source-localized data indicated that none of the GLM-based techniques, when applied in a voxelwise manner, adequately corrected the data.
A second tool for correcting EMG artifacts exploits ICA (Delorme et al. 2007) to blindly separate each individual’s EEG into temporally independent components. Using manual or algorithmic classification (ibid; Fitzgibbon et al. 2007; Mammone and Morabito in press), some components are classified as EMG and discarded prior to reconstruction of the corrected time-series. Typically, this requires inspection of as many components as Channels × Participants, although principal components analysis or other means (Li, Adalı, and Calhoun 2008) can be used to reduce dimensionality prior to ICA (Hu et al. 2005). To date, ICA has been subjected to only modest attempts at validation, using simulated or ad hoc data. Moreover, few investigators provide detailed descriptions of their IC classification protocol or estimates of inter-rater reliability (Fatourechi, Bashashati, Ward, and Birch 2007).
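The reconstruction step can be sketched independently of any particular ICA algorithm. The sketch assumes that a decomposition has already produced a mixing matrix (channels × components) and component activations (components × time points), and that some components have been flagged as EMG. The function name `back_project` and the toy matrices are hypothetical.

```python
def back_project(mixing, activations, reject):
    """Rebuild channel data from component activations, omitting rejected ones.

    mixing:      list of rows, one per channel, each with one weight per component
    activations: list of rows, one per component, each with one value per time point
    reject:      set of component indices classified as artifact
    """
    n_components = len(activations)
    n_times = len(activations[0])
    keep = [c for c in range(n_components) if c not in reject]
    return [
        [sum(row[c] * activations[c][t] for c in keep) for t in range(n_times)]
        for row in mixing
    ]


# Toy decomposition: 2 channels, 2 components, 3 time points
mixing = [[1.0, 2.0],
          [0.0, 1.0]]
activations = [[1.0, 1.0, 1.0],   # component 0, treated here as neurogenic
               [0.0, 1.0, 2.0]]   # component 1, treated here as EMG

original = back_project(mixing, activations, reject=set())
corrected = back_project(mixing, activations, reject={1})
```

The arithmetic is trivial. The difficulty lies entirely in deciding which indices belong in `reject`, since any neurogenic activity carried by a rejected component is removed along with the artifact.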
Superficially, the question of classifying an IC as EMG appears trivial—it is whatever “looks like EMG” in the temporal (“fuzzy”-looking traces), anatomical (broad fringe/rim distribution; isolated channel, suggesting superficial source), and spectral (broad, high-frequency peak; accelerating/flat spectrum) domains. But in our own experience, this question is of the utmost importance (cf. Fitzgibbon et al. 2007).
To illustrate this problem, we describe an ad hoc validation test conducted on unpublished 105-channel ERSP data time-locked to 100ms face presentations. The first 40 independent components (92% variance) were manually classified as (i) EEG, (ii) “pure” EMG, (iii) a mixture, or (iv) non-EMG artifacts (Figure 2a). Correction was performed twice: first removing only non-EMG artifacts, and then removing both non-EMG and pure EMG artifacts (i.e., mixed EMG/EEG components were retained).
Because EMG was not explicitly manipulated in this experiment, validation required another means of identifying data contaminated by greater or lesser amounts of myogenic activity. Accordingly, we exploited individual differences in EMG to create two groups of participants. The high-EMG group (n=22) displayed two or more pure EMG components, whereas the low-EMG group (n=24) displayed one or none. Consistent with expectation, confirmatory analyses indicated that the high-EMG group showed more EMG contamination before ICA correction. Specifically, the cumulative variance predicted by the pure EMG components was greater for the high-EMG (Ms=0.48%, 2%) than the low-EMG group (Ms=0.19%, 0.19%), ps<.001. This effect was specific to the pure EMG components; the variance predicted by components classified as non-EMG artifact (p=.99) and mixed EMG/EEG (p=.12) did not differ. Not surprisingly, the Group × Component-Type (Pure EMG, Mixed EMG/EEG) contrast was also significant, p<.001. As a coarse test of ICA’s sensitivity, we then performed ERSP analyses comparing activity in the gamma band (25–50Hz) across groups. Notably, this revealed that the high-EMG group displayed greater power (Figures 2b, 2c) over lateral-frontal sites (~300–400ms) before and after ICA-based EMG correction.
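The grouping rule and the variance summary described above can be written compactly. The label strings (`"pure_emg"`, `"mixed"`, and so on), the function names, and the toy values are hypothetical. Only the rule itself, two or more pure EMG components for the high-EMG group, comes from the text.

```python
def assign_group(component_labels, threshold=2):
    """Return 'high_emg' if at least `threshold` components are pure EMG."""
    n_pure = sum(1 for label in component_labels if label == "pure_emg")
    return "high_emg" if n_pure >= threshold else "low_emg"


def cumulative_variance(component_labels, variance_pct, target="pure_emg"):
    """Sum the percent variance of all components carrying the `target` label."""
    return sum(v for label, v in zip(component_labels, variance_pct) if label == target)


# Toy participant with four classified components (labels and values invented)
labels = ["eeg", "pure_emg", "mixed", "pure_emg"]
variance = [40.0, 1.5, 2.0, 0.5]

group = assign_group(labels)
emg_share = cumulative_variance(labels, variance)
```

Because group membership is derived from the classification itself, the confirmatory comparison of EMG variance across groups checks the internal consistency of the split rather than providing an independent measure of contamination.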
Collectively, our results suggest that the group difference in gamma band ERSP activity following correction reflects residual EMG artifact. Two mechanisms, both entailing inadequate separation of EMG from EEG, might account for this. It could be that undercorrection reflects the decision to retain mixed EMG/EEG components, which accounted for one-third more variance in the high-EMG (M=2.16%) than in the low-EMG group (M=1.63%), although this difference was not significant. Alternatively, our putatively pure EEG components may have contained subtle EMG.
These observations raise several important questions. In particular, the optimal number of components to extract and classify remains unknown. This decision is likely to have a marked impact on the quality of source separation. Likewise, the appropriate classification of components containing varying ratios of neurogenic and myogenic activity, and the impact of removing such mixed-source components on sensitivity and specificity, remain unresolved. We (McMenamin, Shackman, Maxwell, Bachhuber, Koppenhaver, Greischar, and Davidson, in preparation) are currently pursuing the answers to these questions by pairing the data and validation methods of McMenamin et al. (in press) with ICA-based source separation.
For now, our preliminary findings and other investigators’ work (Fitzgibbon et al. 2007) demonstrate that ICA does not necessarily provide adequate protection against EMG artifact. They also highlight the substantial impact that the choice of component classification and rejection protocol can have on inferential validity.
By describing the challenges of correcting EMG, we hope to stimulate the development and validation of more powerful tools. Promising techniques include using other separation algorithms (Fitzgibbon et al. 2007), incorporating more dimensions (e.g., spatial and spectral) in the decomposition, using higher statistical moments for component classification (Mammone and Morabito in press), and incorporating source anatomy (Barbati et al. 2008). Another promising direction is to validate prospective correction techniques using data obtained in the presence and absence of neuromuscular blockade (cf. Whitham et al. 2008; Whitham et al. 2007). Doing so would avoid the assumptions required by simulated and scripted datasets.
Recent years have witnessed a renaissance of interest in using scalp-recorded and source-localized EEG to answer fundamental questions about how mind arises from brain (Makeig, Debener, Onton, and Delorme 2004; Pizzagalli 2007) and exploiting this knowledge to interface brains with computers (Fatourechi et al. 2007). The development and careful validation of novel tools for separating myogenic from neurogenic signals will have substantial benefits for both endeavors.
This work was supported by NIMH grants P50-MH52354 and MH43454. We thank David Bachhuber, Bridget Kelly, and Adam Koppenhaver for assistance.