Typical electroencephalogram (EEG) recordings often contain substantial artifact. These artifacts, often large and intermittent, can interfere with quantification of the EEG via its power spectrum. To reduce the impact of artifact, EEG records are typically cleaned by a preprocessing stage that removes individual segments or components of the recording. However, such preprocessing can introduce bias, discard available signal, and be labor-intensive. With this motivation, we present a method that uses robust statistics to reduce dependence on preprocessing by minimizing the effect of large intermittent outliers on the spectral estimates.
Using the multitaper method as a starting point, we replaced the final step of the standard power spectrum calculation with a quantile-based estimator, and the jackknife approach to confidence intervals with a Bayesian approach. The method is implemented in provided MATLAB modules, which extend the widely used Chronux toolbox.
Using both simulated and human data, we show that in the presence of large intermittent outliers, the robust method produces improved estimates of the power spectrum, and that the Bayesian confidence intervals yield close-to-veridical coverage factors.
The robust method, as compared to the standard method, is less affected by artifact: inclusion of outliers produces fewer changes in the shape of the power spectrum as well as in the coverage factor.
In the presence of large intermittent outliers, the robust method can reduce dependence on data preprocessing as compared to standard methods of spectral estimation.
Electroencephalography (EEG), a technique for recording the electrical activity of the brain via surface electrodes, is a commonly used assay of brain activity in research and clinical settings. Well-recognized advantages of the EEG include its high temporal resolution, noninvasive nature, and ease of use. However, it is also highly sensitive to electrical activity from non-neural sources, such as eye movements, muscle activity, electrode movement, and electric fields from the environment. These sources generate signals that corrupt the underlying neural signal, and are difficult, if not impossible, to avoid.
For many research applications, and increasingly for clinical applications, spectral measures are used to analyze EEG characteristics. Since activity in specific frequency bands often has direct biological interpretations, the power spectrum is of particular interest. However, since the raw EEG signal is contaminated by non-neural sources, obtaining reliable estimates of the power spectrum that reflect underlying brain activity is not straightforward.
Computation of the power spectrum typically involves segmenting the continuous signal, applying Fourier analysis to each segment, and calculating the mean over segments of the power at each frequency. The data segments, typically of duration 1 second or more, may be determined arbitrarily (e.g., for records of spontaneous EEG), or based on events in a behavioral paradigm (e.g., for event-related potential studies). Fourier components arising from segments contaminated by typical artifacts (e.g., muscle and eye movements) are typically large relative to those of segments that only contain the neural signal, and therefore bias the mean upwards. This problem is usually addressed by removing these artifacts, by a combination of manual identification of artifact-containing segments and automated means, such as independent component analysis (ICA); however this can be labor- and time-intensive, subjective, and can discard portions of usable data.
Here we describe an alternative approach to this outlier problem, via the use of robust statistics. Specifically, we focus on the median and other quantile-based statistics. Via simulations and application to real EEG data, we show that this approach can recover the power spectrum of the underlying signal even in the presence of substantial artifact. Finally, we provide code that extends the Chronux toolbox to carry out these computations, including the calculation of Bayesian confidence intervals.
A power spectrum is typically estimated from a measured time series by cutting the time series into segments, applying Fourier analysis to these segments, and averaging the power in each frequency bin across segments. The true value of the power spectrum is the limit of this process as the length and number of the data segments tend to infinity. However, in practice these segments are finite in length and limited in number, so power spectral estimates are necessarily biased (resulting from spectral leakage due to the finite length of the data segment) and imprecise (due to the finite number of data segments).
The multitaper method, a power-spectral estimator that we use as a starting point for our approach, tackles the tradeoff between this bias and variance in a way that is optimal for Gaussian signals. The method minimizes spectral leakage (the artifactual spreading of power from one frequency bin into its neighbors) by windowing each segment with an orthogonal set of functions, the Slepian tapers. Further background on the multitaper method is available in the standard references on spectral analysis. Chronux is a freely available MATLAB toolbox that provides convenient implementations of the multitaper method, which we extend with an implementation of the robust approach.
The standard multitaper method consists of the following steps: (1) multiplying each data segment by each of the tapers, (2) applying Fourier analysis to these products, (3) averaging over tapers within each segment, and (4) averaging over segments. To formalize this, we denote the original signal by X(t), with B segments cut from the signal denoted as x1(t), …, xb(t), …, xB(t), all of the same length. These segments are non-overlapping, but need not be contiguous. We denote the K Slepian tapers by a1(t), …, ak(t), …, aK(t). (The choice of K is driven by the desired spectral resolution and data length; a common choice for 3-second-long segments, and the Chronux default, is K = 5.)
We denote the power estimate for a single segment b and a single taper k by S_{b,k}(ω):

  S_{b,k}(ω) = |Σ_t a_k(t) x_b(t) e^{−iωt}|².

With this notation, the standard multitaper estimate is a nested mean: first a mean over the K tapers within each segment to obtain the per-segment estimate

  S̄_b(ω) = (1/K) Σ_{k=1}^{K} S_{b,k}(ω),

and then a mean over the B segments:

  Ŝ_mt(ω) = (1/B) Σ_{b=1}^{B} S̄_b(ω).
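The nested-mean structure above can be sketched in a few lines. The following is an illustrative Python/NumPy sketch, not the authors' MATLAB/Chronux implementation; the segment count, taper parameters, and function names are our own choices.

```python
import numpy as np
from scipy.signal.windows import dpss

def multitaper_psd(segments, n_tapers=5, nw=3.0):
    """Steps (1)-(4): taper, Fourier-transform, average over tapers, then segments.

    segments : array of shape (B, T) -- B non-overlapping data segments.
    """
    B, T = segments.shape
    tapers = dpss(T, NW=nw, Kmax=n_tapers)               # (K, T) Slepian tapers
    tapered = segments[:, None, :] * tapers[None, :, :]  # (B, K, T) taper products
    power = np.abs(np.fft.rfft(tapered, axis=-1)) ** 2   # S_{b,k}(omega)
    per_segment = power.mean(axis=1)                     # mean over tapers: (B, n_freq)
    return per_segment.mean(axis=0)                      # mean over segments

rng = np.random.default_rng(0)
x = rng.standard_normal((20, 300))  # 20 white-noise segments of 300 samples
psd = multitaper_psd(x)             # one estimate per rfft frequency bin
```

The intermediate `per_segment` array (the mean over tapers within each segment) is the set of quantities on which the robust method described below operates.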
Since our goal is to reduce the effect of outlier estimates from each segment, we replace the mean over segments by a robust estimator. There are many possible choices for the robust estimator – for example, an estimator based on the hth quantile, a trimmed mean, a Winsorized mean, or iterative rejection of outliers. While the present framework applies to all of these choices, estimators based on quantiles are more amenable to computation of Bayesian confidence intervals (see below), and we therefore focus on these, both in the illustrations below and in the MATLAB toolbox. We denote the estimator based on the hth quantile of the per-segment estimates as Ŝ_h(ω). Note that h = 1/2 corresponds to the median; this is the default value in the code.
Even for Gaussian data, the median power of the tapered estimates does not equal the mean power. This is because spectral estimates are approximately distributed as chi-squared, which is positively skewed. As shown in the Appendix, we can take the skewing into account by dividing the median power by a data-independent scale factor. Furthermore, scale factors can be derived that convert not just the median (0.5 quantile), but any quantile, into mean power. The Appendix details the calculation of these scale factors, which is implemented in the provided MATLAB code.
Including this scale factor yields our main result, the robust spectral estimate:

  Ŝ_h(ω) = Q_h{S̄_1(ω), …, S̄_B(ω)} / C(h, d, B),

where Q_h{·} denotes the hth quantile over the B per-segment estimates (each the mean of the K tapered estimates within a segment); C(h, d, B) is the scale factor for quantile h; d is the number of degrees of freedom (d = 2K for typical frequencies, d = K for DC and the Nyquist frequency); and B, as above, is the number of segments.
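As an illustration of this robust outer step, here is a hedged Python sketch; it is not the Chronux code, and it substitutes the large-B asymptotic scale factor chi2inv(h, d)/d for the exact C(h, d, B) derived in the Appendix. Function and variable names are our own.

```python
import numpy as np
from scipy.stats import chi2

def robust_psd(per_segment, h=0.5, n_tapers=5):
    """per_segment : (B, n_freq) array of mean-over-tapers power estimates."""
    d = 2 * n_tapers                 # degrees of freedom at typical frequencies
    scale = chi2.ppf(h, d) / d       # asymptotic scale factor C(h, d, B -> inf)
    return np.quantile(per_segment, h, axis=0) / scale

rng = np.random.default_rng(1)
B, K, nfreq = 400, 5, 64
# Simulate unit-power Gaussian estimates: chi^2_{2K}/(2K) per segment and bin.
clean = rng.chisquare(2 * K, size=(B, nfreq)) / (2 * K)
contaminated = clean.copy()
contaminated[:40] += 50.0            # 10% of segments carry a large artifact
mean_est = contaminated.mean(axis=0) # standard outer mean: badly biased upward
robust_est = robust_psd(contaminated)  # median-based: stays near the true value 1
```

Even with 10% of segments heavily contaminated, the scaled median remains close to the true unit power, while the mean is pulled far above it.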
Notably, the quantile is applied to the B per-segment power estimates (replacing the outer mean in the equation above); within segments, the step of computing the mean over the tapers remains unchanged from the original method. There are two reasons for this choice: (1) if artifact is present in a segment, it is likely to affect many of the tapered estimates from that segment; and (2) the K Slepian tapers were designed to be used together to capture all of the power within a frequency bin. The toolbox also supports two alternative strategies – a "one-tiered" median across all BK tapered estimates, and a "two-tiered" median taken first across tapers and then across segments – but as we see no principled reason for these, neither is the default. The robust approach is also applicable to other spectral estimation methods, such as Welch's method of windowed averaging.
Standard nonparametric approaches to confidence interval estimation are based on resampling strategies, such as the jackknife or the bootstrap. These approaches are appropriate for the mean, which depends smoothly on the data – a necessary condition for the jackknife or bootstrap to be valid. However, since quantile-based estimates do not depend smoothly on the data, an alternative approach is needed.
Our approach is as follows. Let Q(ω, h) denote the true value of the hth quantile of estimates at frequency ω. We seek the probability distribution P(Q(ω, h)|data): the distribution of the true value of the hth quantile, given the observed data. To find this, we use a Bayesian approach with the conservative choice of an uninformative (flat) prior for the power spectral value. Using Bayes’ theorem, we can express P(Q(ω, h)|data) in terms of the probability of drawing the data from a distribution with known hth quantile, or P(data|Q(ω, h)):
To be consistent with our prior reasoning for implementing a two-tiered approach, we implement the Bayesian approach to confidence intervals by considering the data to be the set of spectral estimates derived from each segment (taking the mean of the tapered estimates within each segment, as above).
We then use order statistics to compute P(data|Q(ω, h)). Specifically, we re-label the B per-segment estimates as Y1, …, YB, sorted in ascending order, so that Y1 is the smallest ranked value. We also denote Y0 = −∞ and YB+1 = +∞, as this will allow us to account for the possibility of Q(ω, h) lying below the smallest ranked value or above the largest. The probability that Q(ω, h) lies between Yi and Yi+1 is equal to the probability that exactly i of the estimates are below this quantile, and B − i are above it. Since the chance of any single estimate lying below the hth quantile of the estimates is exactly h, this probability is determined by the binomial distribution:

  P(Y_i ≤ Q(ω, h) < Y_{i+1}) = (B choose i) h^i (1 − h)^{B−i}.
Thus, to ensure that the probability that Q(ω, h) lies between two ordered values, Yl and Ym (where m > l), is at least 1 − α, we need to find indices l and m for which

  Σ_{i=l}^{m−1} (B choose i) h^i (1 − h)^{B−i} ≥ 1 − α.
We choose the intervals in descending order of probability, so as to satisfy the above inequality with the smallest number of intervals for a given coverage 1 − α. The union of these intervals is the desired confidence interval. Figure 1 illustrates this procedure for α = 0.05 (i.e., 95% confidence intervals).
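The procedure can be sketched as follows; this is our reading of the steps above in illustrative Python, not the authors' released MATLAB code, and the function name is our own.

```python
import numpy as np
from scipy.stats import binom

def quantile_ci(samples, h=0.5, alpha=0.05):
    """Return (lo, hi) bounds covering the true h-quantile with prob >= 1 - alpha.

    Bounds may be -inf/+inf when too few samples are available.
    """
    y = np.sort(np.asarray(samples, dtype=float))
    B = y.size
    # P(Q lies between Y_i and Y_(i+1)) = Binom(B, h) pmf at i, for i = 0..B.
    probs = binom.pmf(np.arange(B + 1), B, h)
    order = np.argsort(probs)[::-1]        # intervals in descending probability
    chosen, total = [], 0.0
    for i in order:                        # greedily accumulate until coverage met
        chosen.append(i)
        total += probs[i]
        if total >= 1 - alpha:
            break
    lo_idx, hi_idx = min(chosen), max(chosen)
    augmented = np.concatenate(([-np.inf], y, [np.inf]))  # Y_0 and Y_{B+1}
    return augmented[lo_idx], augmented[hi_idx + 1]

rng = np.random.default_rng(2)
lo, hi = quantile_ci(rng.standard_normal(200), h=0.5)
```

For 200 samples the bounds are finite order statistics near the sample median; repeating the call on fresh draws shows the true median (here 0) falling inside the interval about 95% of the time or more.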
Note that the confidence interval provided by the above procedure for a coverage 1 − α typically also applies to coverage factors somewhat larger than 1 − α. This is because the upper and lower confidence limits are tethered to discrete values (the observations Yb), so confidence bounds that satisfy the above inequality for a given α typically also satisfy it for somewhat smaller values of α. The relationship of the predicted coverage factor to the number of samples is shown in Figure 2A (for α = 0.05 and h = 0.5). Note also that if the number of segments is sufficiently small, then it may be necessary to include the intervals (−∞, Y1] and [YB, ∞) in order to satisfy the inequality. Figure 2B shows the minimum number of segments required to have finite confidence intervals, as a function of h.
For comparison purposes, we also calculated confidence intervals via the jackknife procedure, which is the standard Chronux approach. Specifically, the jackknife confidence interval is computed by pooling all estimates from all tapered segments (for a total of BK estimates) and generating BK subsets of size BK − 1, each omitting one tapered estimate. The standard or robust estimator of central tendency (mean for the standard method; quantile for the robust method) is then applied to each subset. The standard deviation is calculated, and significance is determined according to the t-distribution with BK − 1 degrees of freedom. The provided Chronux extension includes jackknife-based as well as bootstrap-based confidence intervals for the robust estimators; however, these are intended only for comparison with the Bayesian confidence intervals, not for routine use.
Finally, we note that the above approach to confidence limit estimation is distinct from the naïve strategy of choosing the α/2 and 1 − α/2 quantiles of the spectral estimates themselves, rather than the ranked values whose indices are identified by the binomial criterion above. With the naïve strategy, confidence intervals do not narrow as the amount of data increases. Moreover, the impact of outliers is not removed if the outlier fraction in either tail is at least α/2.
We applied the above procedures to (1) a synthetic signal of a known power spectral distribution corrupted by artifact, and (2) an EEG record from a human subject, with the typical artifacts of clinical recordings.
The simulated data consisted of a Gaussian signal of known power spectral distribution, to which we added a controlled amount of simulated artifact. The signal was synthesized from random-phase, Gaussian-distributed Fourier components, whose mean power was proportional to 1/ω (over the range 1/3 to 100 Hz in steps of 1/3 Hz), where ω is the frequency. That is, at each frequency ω in this range, we set the real and imaginary parts of the Fourier component of the signal to be independent draws from a zero-mean normal distribution, with variance chosen so that the expected power was proportional to 1/ω. We then inverted the transform to create the time-domain signal x(t) in that segment. Artifacts, which were added in the time domain, consisted of bursts of Gaussian signal with a flat (constant) power spectrum. The artifact power per unit bandwidth was 3.3 times greater than the signal power per unit bandwidth at the lowest frequency, 1/3 Hz, so the maximum signal-to-noise ratio was 1/3.3 ≈ 0.3. The burst length was 0.5 seconds (within segments simulated to be 3 seconds in duration), and bursts were added to the signal x(t) at Poisson-distributed intervals, with an average of 0.25 bursts per segment. We studied the performance of the two methods for data with and without added noise. Sample data, both with and without artifact, are shown in Figure 3. Since the power spectrum of the underlying signal was proportional to 1/ω, the signal-to-noise ratio decreases further with increasing frequency from its maximum at 1/3 Hz. Thus, we anticipate that at sufficiently high frequencies, the noise bursts will lead to unacceptable corruption of the spectral estimates.
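For concreteness, a Python sketch of this simulation recipe follows; the sampling rate, burst amplitude, and random-number choices are illustrative assumptions, not the authors' exact parameters.

```python
import numpy as np

rng = np.random.default_rng(3)
fs = 200.0                        # assumed sampling rate (Hz)
seg_dur = 3.0                     # 3-second segments
n = int(fs * seg_dur)
freqs = np.fft.rfftfreq(n, d=1 / fs)

def one_over_f_segment():
    """Gaussian segment whose expected power is proportional to 1/omega."""
    amp = np.zeros_like(freqs)
    band = (freqs >= 1 / 3) & (freqs <= 100)
    amp[band] = np.sqrt(1.0 / freqs[band])
    spec = amp * (rng.standard_normal(freqs.size)
                  + 1j * rng.standard_normal(freqs.size))
    return np.fft.irfft(spec, n=n)

def add_burst(x, rate=0.25, burst_dur=0.5, gain=np.sqrt(3.3)):
    """Add Poisson-occurring 0.5 s flat-spectrum Gaussian bursts to a segment."""
    x = x.copy()
    if rng.poisson(rate) > 0:
        m = int(fs * burst_dur)
        start = rng.integers(0, x.size - m)
        x[start:start + m] += gain * rng.standard_normal(m)
    return x

clean = np.array([one_over_f_segment() for _ in range(40)])
noisy = np.array([add_burst(s) for s in clean])
```

With an average of 0.25 bursts per segment, roughly a quarter of the 40 segments end up contaminated, mirroring the 25%-artifact condition used below.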
The human EEG data were obtained from a healthy control subject (23-year-old male). Data were recorded with an FS128 headbox and an XLTEK acquisition system (Natus Medical, Pleasanton, CA 94566) using an augmented 10/20 montage. The sampling frequency was 250 Hz. Input impedances were ≤5 kΩ. The subject was awake during testing, and generated spontaneous movement artifacts and eye blinks. High-pass and low-pass filters of 0.01 and 100 Hz, respectively, and a 60 Hz notch filter, were applied. For analysis, 40 segments (either 1 or 3 s in duration) were selected from a recording of 8 min: 20 of these segments, labeled "significant artifact," contained EMG, eye-blinks, or other artifacts, as determined by visual inspection of the EEG and simultaneously recorded video (carried out by an experienced EEG analyst). The other 20 segments were labeled "minimal artifact," as they contained little or no visible artifact. Figure 4 shows one example of a "minimal artifact" segment and two examples of "significant artifact" segments. To generate datasets with 25% artifact, we drew 5 significant-artifact segments and 15 minimal-artifact segments at random from these two subgroups.
Human subject participation was approved by the Institutional Review Board, and was consistent with the Declaration of Helsinki.
We first compare the standard multitaper method and the robust method for simulated signals. Because the true underlying spectrum is known, this allows rigorous assessment of accuracy and the coverage factors of the estimated confidence intervals. We then apply the standard and robust methods to a sample of human EEG, and show that the robust method is less sensitive to typical EEG artifact encountered in clinical recordings.
Figure 5 compares power spectral estimates from the standard and robust multitaper methods on a simulated EEG signal. Each method was applied to two data sets that differed in the number of noise-contaminated segments: a clean dataset, and a dataset in which an average of 25% of segments contained artifact (see Methods for details). For the standard method, when artifacts were present, the expected high-frequency decline of the power spectrum was corrupted by the flat spectrum of the noise bursts, showing that the artifact significantly affects the estimated spectrum. In comparison, the results from the robust method (right panel) show that even at frequencies at which the standard estimate is dominated by the noise bursts, the robust estimate reflects the underlying signal's spectrum. In sum, in a data set where outliers significantly affect spectral estimates from the standard method, the estimate from the robust method can capture the underlying spectrum.
The simulated data allowed for an assessment of confidence interval estimation methods (Figure 6), and for a comparison of the Bayesian confidence intervals to the confidence intervals computed by the standard Chronux approach. The first column of the Figure shows results for the standard approach, i.e., the standard multitaper estimates with jackknife confidence intervals. As expected, when no noise is present (top panel), the spectral estimates are close to the true value, and confidence interval coverage is approximately 95%. When noise is added, the spectrum is upwardly biased by the noise, and the coverage factors drop. The third column of the figure shows that for the robust method, the confidence interval coverage determined by the Bayesian procedure described in Methods remains at approximately 95%, even when noise is added.
The middle column shows that the jackknife approach fails when there are outliers, even when the robust method is used. Specifically, when a jackknife procedure is applied to the robust method, the confidence intervals have a lower-than-veridical coverage factor, and they are also more irregular than the Bayesian confidence intervals. As mentioned in Methods, the failure of confidence intervals based on resampling is not surprising, since the median depends in a non-smooth fashion on the data. However, even under these circumstances – and when artifacts are not explicitly removed – the Bayesian confidence intervals are approximately veridical.
Figure 7 shows that the robust method can be successfully applied to human EEG data. Figure 7A compares spectra determined by standard analysis of minimal-artifact (hand-cleaned) EEG segments with a set of EEG segments of which 25% contained significant artifact. As expected, there were substantial deviations of the spectra obtained when significant artifact was present. These included deviations at low frequencies in frontal channels, presumably due to eye movement artifact, and deviations at high frequencies in frontal and posterior channels, presumably due to myogenic artifact. Figure 7B shows that these deviations, especially at high frequencies, are largely eliminated when the robust method is used.
Figure 7C shows that on cleaned data the robust method gives results that are very close to that of the standard method. Thus, the resistance of the robust method to corruption by artifact is not at the expense of a distortion of the result. Figure 7D demonstrates these same findings when the data are cut into 1-second segments.
Above we have shown that a simple modification of the standard approach to spectral estimation – substituting a robust estimator for the mean across segments – can substantially improve spectral estimates of EEG signals in the presence of artifact. The basic rationale is that robust estimators are insensitive to outliers, and many sources of artifact behave as outliers. When combined with the multitaper method, key advantages of the latter are retained: spectral leakage is minimized, and reliable confidence intervals can be estimated.
The proposed data-driven approach reduces the reliance on removal of artifact by other means. This has several advantages: with less preprocessing, fewer data will be discarded, potentially enabling the capture of subtle EEG dynamics. A reduced reliance on preprocessing methods also has the benefit of reducing the dependence on ad hoc or subjective methods of artifact identification, and may also accelerate the data-processing pipeline.
It is worth noting that the robust methods described here are computationally efficient. Since the MATLAB implementations of median() and quantile() (used for the power spectrum) and sort() (used for the Bayesian confidence intervals) have approximately linear runtime in practice even for 10^8 segments, the robust method retains the approximately linear runtime of the standard multitaper approach.
Because the robust method works by separating signals that are pervasive (e.g., those reflecting background state) from those that are infrequent, it may discard an intermittent signal of neural origin – such as paroxysmal activity – as artifact. That is, because the method works precisely by removing outliers, it removes not only many sources of artifact but potentially also neural-origin EEG activity that is infrequent.
Although this method greatly improves spectral estimates for certain data sets, it should not be treated as a panacea for all analysis-limiting noise. Since quantile estimators have a breakdown point of ≤50%, this method may not show any improvement over the standard analysis pipeline for constant or frequent noise that affects the same frequency range in most or all segments, such as 60 Hz line noise from the environment or frequent muscle tics. In these cases, alternatives such as notch filtering or artifact removal by hand must be used. We also note that the proposed approach will not remove pervasive low-level EMG, which can bias spectral estimates in the gamma range.
For clarity we tested the method here in the absence of other artifact removal techniques. However, there are benefits to other techniques for removing outliers, such as ICA, automated outlier rejection, or hand-cleaning guided by video assessment of the subject’s movements. Combining artifact removal techniques with the robust method may be more effective than either approach on its own.
While we have illustrated the method for a continuous recording of spontaneous EEG, it is also applicable to event-related-potential (ERP) experiments. In this scenario, the data segments can be determined by the epochs or event markers themselves, and the spectra calculated as described above. However, we note that the scale factor calculation (Appendix A) assumes that the spectral estimates are chi-squared-distributed, which in turn requires that there is no time-locked component, as is the case for spontaneous EEG. If, in an ERP experiment, a significant time-locked component is present because of events synchronized to the epoch boundaries, caution should be exercised: (a) the power spectrum is no longer rigorously defined, and (b) the scale factor will be frequency-dependent and biased towards 1.
The utility of robust estimators as applied to the EEG spectrum suggests that robust methods will also be useful in the multichannel domain. In this context, robust estimators of shape, such as the minimum volume ellipsoid (MVE) or the minimum covariance determinant (MCD), could be used to estimate cross-spectra, much as the quantile-based estimators here characterize power. While this specific robust approach appears to be as yet unexplored, a previous study has shown that using the median instead of the mean improves multitaper coherence estimates. A full-fledged robust estimator of shape could provide phase information as well.
We thank Mary Conte for collecting and providing the EEG data, Tanya Nauvel for hand-cleaning the data, and Jonathan Drover for discussions on the theoretical background in this manuscript. We also thank Tanya Nauvel for her helpful comments on the manuscript and Andrew Goldfine for code testing.
Support was provided by the Tri-Institutional Training Program in Computational Biology and Medicine (via NIH training grant 1T32GM083937 to TM) and the James S. McDonnell Foundation (Nicholas D. Schiff, PI).
As mentioned in the main text, the raw quantile of a set of spectral estimates is expected to be proportional to the power, but not equal to it. The proportionality constant depends on the quantile h, the number of degrees of freedom d of the underlying χ² distribution describing the expected distribution of spectral estimates for Gaussian data, and the number of samples (segments), B. Here we derive this scale factor C(h, d, B) by determining the hth quantile of the tapered spectral estimates for a Gaussian signal of unit power.
For most frequencies ω, with the exception of ω = 0 and the Nyquist frequency, Fourier estimates are complex numbers. When ω is greater than the bandwidth of the tapers, the real and imaginary components are approximately independent and of equal variance [18, p. 360], so for such ω the power is distributed as the sum of 2K squares of Gaussian-distributed quantities, where K is the number of tapers. For ω = 0 and the Nyquist frequency the Fourier estimates are real, so the power is distributed as the sum of K such quantities. Therefore the power at a particular frequency ω is distributed as χ²/d, where χ² has d degrees of freedom, and d = 2K or d = K, respectively. The proportionality between the quantiles of this distribution and its mean is the factor that converts the quantile estimate into an estimate of power.
To compute the expected value of a quantile, we use the strategy shown in Figure 8. We first find a monotonic transformation from the uniform distribution on the interval [0,1] into the chi-squared distribution of spectral estimates. Because the transformation is monotonic, the rank-order of the samples drawn from the chi-squared distribution corresponds to the rank-order of the corresponding values in the uniform [0,1] distribution. Therefore, we can take the expected distribution of the hth quantile in the uniform distribution, and transform it back into the chi-squared distribution to determine C(h, d, B).
To determine this transformation, we note that for an arbitrary distribution with probability density function (pdf) q(x), the cumulative distribution function (cdf) s(x) is given by

  s(x) = ∫_{−∞}^{x} q(x′) dx′.

Because s(x) is monotonic, this can be inverted to give x = s^{−1}(s). The cdf, evaluated at a random draw from the distribution, is uniformly distributed between 0 and 1. Thus, with q taken to be the pdf of the spectral estimates, x = chi2inv(s, d) is the desired transformation between a uniformly distributed quantity s and the (χ²-distributed) spectral estimates, where chi2inv(s, d) is the inverse cumulative chi-squared distribution function at s with d degrees of freedom.
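A quick numerical check (our illustration, not part of the original derivation) confirms that this monotone transform preserves rank order, which is what allows order statistics on the uniform distribution to carry over to the chi-squared-distributed spectral estimates.

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(4)
u = rng.uniform(size=1000)     # draws from the uniform distribution on [0, 1]
x = chi2.ppf(u, df=10)         # monotone map chi2inv(s, d) into chi^2_10
# Because the map is monotone, the rank order of x matches that of u exactly,
# and x is distributed as chi-squared with 10 degrees of freedom (mean 10).
```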
We now apply order statistics to the variable s, which is uniformly distributed on [0,1]. The distribution of the (k + 1)th-ranked value among N + 1 = B draws from the uniform distribution is given by

  p_{k+1}(s) = s^k (1 − s)^{N−k} / beta(k + 1, N − k + 1),

where beta(u, v) is the beta-function,

  beta(u, v) = ∫_0^1 t^{u−1} (1 − t)^{v−1} dt = Γ(u)Γ(v)/Γ(u + v).
Transforming back to the distribution of spectral estimates, we find that the (k + 1)th-ranked spectral estimate for a unit-power signal is chi2inv(s, d)/d, with s distributed as above. The expected value of this quantity is therefore:

  E_{k+1} = ∫_0^1 [chi2inv(s, d)/d] · s^k (1 − s)^{N−k} / beta(k + 1, N − k + 1) ds.
When the quantile h falls exactly on a sample, i.e., when h = (k + 1/2)/B under the MATLAB quantile convention, this expected value is C(h, d, B). When the quantile h falls between two samples, C(h, d, B) is determined by interpolating between the values at the two adjacent values of k. By default, the code uses the MATLAB convention for quantile interpolation, i.e., a weighted average of the values at the two adjacent values of k.
As B = N + 1 increases, the above result takes a simple asymptotic form, since the beta-distributed factor of the integrand becomes concentrated at s = h. In this limit, C(h, d, B) approaches chi2inv(h, d)/d.
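To illustrate the scale factor numerically, the following sketch compares a Monte-Carlo estimate of the expected h-quantile of B unit-power χ²/d draws with the asymptotic value chi2inv(h, d)/d; the sample sizes, tolerances, and function name are our own choices, not part of the original derivation.

```python
import numpy as np
from scipy.stats import chi2

def scale_factor_mc(h, d, B, n_rep=20000, seed=0):
    """Monte-Carlo estimate of C(h, d, B): expected h-quantile of B
    unit-power spectral estimates (distributed as chi^2_d / d)."""
    rng = np.random.default_rng(seed)
    draws = rng.chisquare(d, size=(n_rep, B)) / d    # unit-power estimates
    return np.quantile(draws, h, axis=1).mean()      # expected h-quantile

d, h = 10, 0.5                        # e.g. K = 5 tapers, median
c_small = scale_factor_mc(h, d, B=20)  # finite-B scale factor
c_limit = chi2.ppf(h, d) / d           # asymptotic (large-B) form
```

For the median with d = 10, both values fall slightly below 1, reflecting the positive skew of the chi-squared distribution that the scale factor is designed to correct.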