An increasing number of schizophrenia studies have been examining electroencephalography (EEG) data using time-frequency analysis, documenting illness-related abnormalities in neuronal oscillations and their synchronization, particularly in the gamma band. In this article, we review common methods of spectral decomposition of EEG, time-frequency analyses, types of measures that separately quantify magnitude and phase information from the EEG, and the influence of parameter choices on the analysis results. We then compare the degree of phase locking (ie, phase-locking factor) of the gamma band (36–50 Hz) response evoked about 50 milliseconds following the presentation of standard tones in 22 healthy controls and 21 medicated patients with schizophrenia. These tones were presented as part of an auditory oddball task performed by subjects while EEG was recorded from their scalps. The results showed prominent gamma band phase locking at frontal electrodes between 20 and 60 milliseconds following tone onset in healthy controls that was significantly reduced in patients with schizophrenia (P=.03). The finding suggests that the early-evoked gamma band response to auditory stimuli is deficiently synchronized in schizophrenia. We discuss the results in terms of pathophysiological mechanisms compromising event-related gamma phase synchrony in schizophrenia and further attempt to reconcile this finding with prior studies that failed to find this effect.
Electroencephalography (EEG) recordings have long been used to help identify sensory and cognitive deficits in individuals with schizophrenia through analyses of event-related potentials (ERPs). While the examination of ERPs has provided useful insights into the nature and timing of neuronal events that subserve sensory, perceptual, and cognitive processes, the EEG data from which ERPs are derived have received relatively little attention until fairly recently. Aside from the increasing availability of computer hardware and software for conducting computationally intensive EEG time-frequency analyses, recent interest in studying event-related EEG stems from developments in basic and systems neuroscience suggesting that neural oscillations and their synchronization represent important mechanisms for interneuronal communication and binding of information that is processed in distributed brain regions. EEG data comprise the volume-conducted summation of these neural oscillations and their synchronizations, providing an opportunity to translate what is known from basic neuroscience about the modulation of these oscillations to in vivo human EEG studies and to gain new insights into the pathophysiological processes underlying cognitive deficits and clinical symptoms in neuropsychiatric disorders such as schizophrenia. The interest in studying abnormal neural synchrony in schizophrenia, as reflected in EEG data, is also motivated by empirical findings implicating compromise of mechanisms that subserve neural oscillations and their synchronization.1–6
The principal approach to studying event-related EEG oscillations involves decomposition of the EEG signals into magnitude and phase information for each frequency present in the EEG (so-called “spectral decomposition”) and to characterize their changes over time (on a millisecond time scale) with respect to task events. Broadly speaking, this approach is referred to as “time-frequency analysis.” Time-frequency analysis comprises many methods and measures that capture different aspects of EEG magnitude and phase relationships. While some are conceptually and mathematically related, others are conceptually distinct and complementary in terms of the information they provide about neural activity. As the EEG or magnetoencephalography (MEG) data describing abnormal neurooscillatory activity in schizophrenia begin to proliferate,6 it is important to maintain an appreciation of the distinctions among the measures used and their mathematical underpinnings.
Accordingly, in the present article, we provide an overview of EEG time-frequency analysis, including a discussion of the information it provides relative to traditional ERP analysis, a review of some of the major analytic approaches to spectral decomposition of EEG, and an emphasis on the conceptual differences among the measures that are commonly associated with the concept of “neural synchrony.” In addition, we present a time-frequency analysis of EEG data from a simple auditory oddball task in healthy control subjects to illustrate the impact of different parameter choices on the resulting time-frequency decomposition, focusing on the 50-millisecond poststimulus gamma band response elicited by standard tones. This is followed by an analysis that directly compares healthy controls and patients with schizophrenia on this gamma band response.
A useful departure point for this discussion is to contrast ERP analysis with modern event-related time-frequency analysis of EEG. ERPs are systematic positive or negative voltage deflections evident in the averages of EEG epochs time-locked to a class of repeated stimulus or response events. As a result of averaging across a large number of epochs, the “random” activity in the EEG cancels out, approaching zero as the number of trials increases. The waves that survive this averaging process, known as ERP components, reflect deviations from a preevent baseline, and their peak amplitudes and latencies are thought to index discrete sensory and cognitive processes that unfold over time in response to a class of events. The traditional view of ERPs, sometimes referred to as the additive ERP model,7 assumes that ERP components reflect transient bursts of neuronal activity, time locked to the eliciting event, that arise from one or more neural generators subserving specific sensory and cognitive operations during information processing. In this view, ERPs are superimposed on, and embedded in, ongoing background EEG “noise” with amplitude and phase distributions that are completely unrelated to processing of the task events.
This view of ERPs has been challenged on at least 2 counts. First, time-frequency analysis of single-trial EEG epochs reveals that EEG does not simply reflect random background noise; rather, there are event-related changes in the magnitude and phase of EEG oscillations at specific frequencies that support their role in the event's processing.7 Second, ERPs themselves may represent transient phase resetting of ongoing EEG by experimental events, leading to transient time- and phase-locking of frequency-specific oscillations with respect to an event's onset on trial after trial.7,8 These phase-synchronized oscillations survive cross-trial averaging and are evident as waves in the average ERP. A related alternative is that ERPs result from event-related partial phase resetting of ongoing oscillatory activity along with transient increases in the magnitude of oscillations that are time-locked to the experimental events.7,9
Makeig et al.,7 who have been at the forefront of challenging the traditional additive model of ERPs, have developed an overarching approach for analysis of event-related EEG data that they call “event-related brain dynamics.” This approach emphasizes the spectral decomposition of single-trial event-related EEG epochs in order to separately examine event-related changes in the magnitude and phase of oscillations at specific frequencies. The approach also includes examination of strategically sorted single trials of EEG in graphical form (called ERP images) in order to reveal systematic relationships between event-related amplitude changes and other characteristics of the trials (eg, reaction times, phase angles at specific frequencies).10 In this way, the approach provides a more refined and detailed account of the brain's event-related neurooscillatory activity, relative to the more static view provided by the traditional ERP approach. Nonetheless, it is important to note that despite the richness of the information provided by time-frequency analyses, they are not able to unambiguously differentiate between the alternative models of ERP generation discussed above, as was elegantly demonstrated by Yeung et al.11,12
While ERPs and time-frequency analysis of EEG both provide a view of the serial or sequential events in the brain's information processing stream, an increment provided by time-frequency analysis of EEG, relative to ERPs, is its potential to view the brain's parallel processing of information, with oscillations at various frequencies reflecting multiple neural processes co-occurring and interacting4 in the service of integrative and dynamically adaptive information processing. This incremental benefit of EEG time-frequency analysis, relative to ERPs, may also be manifested in greater sensitivity to the true nature of the neuropathophysiological processes underlying schizophrenia. For example, we have recently shown with EEG data from a simple oddball paradigm that both phase and power measures are more sensitive to schizophrenia than traditional ERP components such as the P300.13
When a large number of parallel-oriented cortical neurons receive the same repetitive synaptic input and/or generate the same repetitive sequence of outputs, their synchronous activity produces extracellular rhythmic field potentials. These open electrical fields are propagated or “volume conducted” throughout the body, dropping off with increasing distance from the source. Accordingly, stronger fields propagate further than weaker fields. These open rhythmic field potentials can be recorded as EEG from the scalp if they are strong enough and have the right orientation (ie, perpendicular or radially oriented fields with respect to the scalp surface produce stronger scalp potentials than parallel or tangentially oriented fields).14,15 Thus, if the neural activity recorded by scalp EEG electrodes were not already synchronized and not already powerful, it would not be evident at the scalp. Therefore, even before it is spectrally decomposed, EEG at the scalp is prima facie evidence of neural synchrony of cortical activity.
Time-frequency analyses of EEG provide additional information about neural synchrony not apparent in the ongoing EEG. They can tell us which frequencies have the most power at specific points in time and space and how their phase angles synchronize across time and space. Because EEG rhythms are themselves the product of synchronized activity among and within neuronal assemblies, it is often assumed that changes in EEG power reflect underlying changes in neuronal synchrony, as exemplified by the use of the terms “event-related synchronization” or “event-related desynchronization” to describe event-related changes in EEG power.16,17 However, it is not actually possible to know whether changes in EEG power reflect changes in the magnitude of the rhythmic field potentials or changes in their degree of synchronization. Nonetheless, using time-frequency analyses, we can assess changes in power and synchronization of EEG on a higher order, within or between spatial locations across trials with respect to the onset of task events. The methods providing these distinctions are described below.
EEG is traditionally modeled as a series of sine waves of different frequencies overlapping in time and with different phase angles with respect to a stimulus. A sine wave (figure 1A) is defined in terms of its frequency, its magnitude, and its phase. The frequency of a sine wave refers to the number of complete cycles or oscillations within a 1-second time period and has the units of Hertz (Hz=cycles per second). The magnitude refers to the maximum height of the sine wave's peaks (or valleys) with respect to the x-axis. The phase refers to where specific time points fall within a cycle of the sine wave, ranging from −180° to 180° or, when expressed in radians, ranging from −π to π. These concepts are illustrated in figure 1A for a 10-Hz sine wave. This oscillation over time describes the signal in the “time domain,” but the signal can also be represented in the “frequency domain” by means of a spectral decomposition that extracts a complex number for one or more frequencies. In time-frequency decompositions, a complex number is estimated for each time point in the time-domain signal, yielding both time and frequency domain information.
Complex numbers comprise both real and imaginary components that can be plotted on a 2-dimensional graph with the x-axis representing the real component (r) and the y-axis representing the imaginary component (i) (see figure 1B). If a line is drawn from the origin of this graph to the complex data point in the x-y (ie, r-i) plane, 2 characteristics of the sine wave are defined for the specific time point being evaluated: a magnitude value and a phase angle (θ). The magnitude is equal to the length of the line (or “magnitude vector”) connecting the origin (0, 0) to the complex data point (r, i) and is related to the amplitude, which is the square root of the power, of the sine wave at that time point (The magnitude is obtained by applying the Pythagorean theorem (a² + b² = c²), where the real (a) and imaginary (b) values are 2 legs of a right triangle. The magnitude is the hypotenuse (c). In this way, the magnitude for any time, frequency, electrode, and trial can be calculated.). The phase angle is equal to the angle formed by the magnitude vector and the x-axis and ranges in value from −180° to 180°. The complex numbers for 4 time points in the sine wave shown in figure 1A are graphed as vectors depicting their magnitudes and phase angles in figure 1B.
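As a minimal numeric sketch (hypothetical values, not data from the study), the magnitude and phase angle can be recovered from the real and imaginary components of a complex spectral coefficient:

```python
import numpy as np

# Hypothetical complex coefficient: real part 3, imaginary part 4
c = 3 + 4j

# Magnitude: hypotenuse of the right triangle whose legs are the
# real and imaginary parts (a² + b² = c²)
magnitude = np.abs(c)                 # sqrt(3² + 4²) = 5.0

# Phase angle formed by the magnitude vector and the x-axis,
# in degrees, ranging from -180 to 180
phase_deg = np.degrees(np.angle(c))   # about 53.13 degrees

# Power is the squared magnitude
power = magnitude**2                  # 25.0
```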
Just as described for this sine wave example, figure 2 shows that single-trial EEG epochs (figure 2A) can be spectrally decomposed into complex numbers for each EEG time point, providing estimates of the magnitude and phase angles of the oscillations (figure 2C) at any given frequency. This is accomplished through multiplication of the EEG with a windowed transformation function (eg, Morlet wavelet transform, as shown in figure 2B) centered on a segment of the EEG epoch, an operation known as “convolution” that can be defined as the multiplication of one series or “vector” of numbers by another.18 By sliding this windowed function across the EEG time series one point at a time, a complex number at the window's center point is estimated for each time point in the EEG (figure 2C). When this is done for each trial, the complex number values for a specific time point relative to an event's onset (eg, stimulus onset) are collected across trials (figure 3A). At this point, it is possible to independently isolate the magnitude or phase information derived from these complex numbers. Thus, the magnitude (length) of the complex number vectors can be extracted, squared, and averaged (figure 3B), yielding the mean power for a given frequency at a particular time point (figure 3D and 3F; see “Power Across Trials” section below). Likewise, when each complex data point is divided by its corresponding magnitude, a new series of complex data points is generated where the phase angles are preserved, but the magnitudes are transformed to one (ie, unit normalized) (figure 3C). These magnitude-normalized complex values can then be averaged, yielding a measure of the cross-trial phase synchrony for a particular frequency at a particular time point (figure 3E and 3G; see “Phase Synchrony Between Trials” section below).
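A minimal single-trial sketch of this convolution, assuming synthetic data, an illustrative 500-Hz sampling rate, and a 6-cycle complex Morlet wavelet (none of these values come from the study):

```python
import numpy as np

fs = 500.0                       # sampling rate (Hz); illustrative value
t = np.arange(-0.5, 0.5, 1/fs)   # 1-s epoch centered on stimulus onset
freq = 10.0                      # frequency to extract (Hz)
n_cycles = 6                     # cycles in the wavelet

# Synthetic single-trial "EEG": a 10-Hz oscillation plus noise
rng = np.random.default_rng(0)
eeg = np.sin(2*np.pi*freq*t) + 0.5*rng.standard_normal(t.size)

# Complex Morlet wavelet: a complex sinusoid tapered by a Gaussian envelope
sigma_t = n_cycles / (2*np.pi*freq)            # Gaussian SD in seconds
wt = np.arange(-3*sigma_t, 3*sigma_t, 1/fs)    # wavelet support
wavelet = np.exp(2j*np.pi*freq*wt) * np.exp(-wt**2 / (2*sigma_t**2))
wavelet /= np.abs(wavelet).sum()               # simple normalization

# Sliding the wavelet along the epoch (convolution) yields one
# complex coefficient per EEG time point
coeffs = np.convolve(eeg, wavelet, mode="same")
magnitude = np.abs(coeffs)   # related to amplitude; square for power
phase = np.angle(coeffs)     # phase angle in radians, -pi to pi
```

Collecting `coeffs` at a fixed poststimulus latency across many such trials is the starting point for the power and phase-synchrony measures described below.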
In short, once the distinct magnitude and phase characteristics of EEG oscillations have been extracted, they can be quantified in a variety of ways to elucidate different aspects of dynamic brain function and neural synchrony. A survey of these quantification approaches is presented below.
There are many approaches to time-frequency decomposition of EEG data, including the short-term Fourier transform (STFT),19 continuous20,21 or discrete22 wavelet transforms, the Hilbert transform,23 and matching pursuits.24 A comprehensive survey of time-frequency decomposition methods is beyond the scope of this article, but some basic points about time-frequency transformations can be made that highlight differences among some of the methods and also underscore some more general considerations. Perhaps the most important overarching principle is that all time-frequency decomposition methods strike some compromise between temporal resolution and frequency resolution in resolving the EEG signals. In general, the larger the time window used to estimate the complex data for a given time point, the greater the frequency resolution but the poorer the temporal resolution. This trade-off between precision in the time domain vs the frequency domain is formalized in the Heisenberg uncertainty principle,25 discussed again in a later section.
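The trade-off can be made concrete with the reciprocal relation between window length and frequency bin width: the bin width of a windowed spectral estimate is the inverse of the window duration, so sharpening one resolution necessarily blurs the other. A small sketch of the arithmetic (illustrative window lengths only):

```python
# Frequency resolution (bin width, in Hz) of a windowed spectral
# estimate is the reciprocal of the window length in seconds.
def freq_resolution_hz(window_s):
    return 1.0 / window_s

# A 500-ms window yields 2-Hz bins; shrinking the window to 100 ms
# improves temporal resolution but coarsens the bins to 10 Hz.
coarse_time = freq_resolution_hz(0.5)   # 2.0 Hz
fine_time = freq_resolution_hz(0.1)     # 10.0 Hz
```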
A variant of the fast Fourier transform (FFT), known as the STFT, or windowed Fourier transform19 performs a Fourier transform within a time window that is moved along the time series in order to characterize changes in power and phase of EEG signals over time. Typically, a fixed duration time window is applied to all frequencies. The choice of time window constrains the frequency bin size (ie, frequency resolution), which is uniform across all frequencies, and also determines the lowest resolvable frequency. The uniformity of the time window across frequencies is a limitation of this approach because optimal characterization of temporal changes in high-frequency signals requires shorter time windows than those needed to optimally characterize low-frequency signals. A more flexible approach in which window size varies across frequencies to optimize temporal resolution of different frequencies is therefore desirable. Wavelet analysis provides such an approach.
Continuous wavelet transforms describe a class of spectral decomposition methods that are conceptually related to the windowed short-term Fourier analysis described above. Wavelets are waveforms of limited duration that have an average value of zero. While any number of waveforms can be considered a wavelet, to be useful in modeling biological signals such as EEG, the waveforms contained in the wavelet must provide a biologically plausible fit to the signal being modeled. One common type of biologically plausible wavelet, the Morlet wavelet, is a Gaussian-windowed (see below) sinusoidal wave segment comprising several cycles (figure 2B). A family of wavelets, comprising compressed and stretched versions of the “mother wavelet” to fit each frequency to be extracted from the EEG, is traditionally constrained to contain the same number of cycles across frequencies. As a result, wavelet analyses utilize a different time window length for each frequency, with the longest windows applied to the lowest frequencies and the shortest windows applied to the highest frequencies. For example, assuming a wavelet family contains 6 cycles of a sinusoidal oscillation, the wavelet for the 10-Hz frequency spans a time window of 600 milliseconds, whereas the wavelet for the 40-Hz frequency spans a time window of 150 milliseconds. This variation in the wavelet from coarser to finer temporal resolution with increasing frequency is achieved at the cost of diminishing frequency resolution as frequency increases.
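The 600- and 150-millisecond figures follow directly from the cycle count: an n-cycle wavelet at frequency f spans n/f seconds. A sketch of that arithmetic:

```python
def wavelet_window_ms(freq_hz, n_cycles=6):
    """Duration (in milliseconds) of an n-cycle wavelet at a given frequency."""
    return 1000.0 * n_cycles / freq_hz

# Six cycles at 10 Hz span 600 ms; at 40 Hz they span only 150 ms.
low_freq_window = wavelet_window_ms(10.0)    # 600.0
high_freq_window = wavelet_window_ms(40.0)   # 150.0
```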
The sinusoidal waves contained in a wavelet are typically shaped by an envelope function (eg, a Gaussian function), such that the wavelet has its largest magnitude at the center time point and tapers off toward the edges of the time window. Wavelets used in spectral decomposition are complex, containing both real and imaginary sinusoids (see figure 2B). Each wavelet in a wavelet family is convolved with the time series of EEG data, sliding the wavelet time window across the time series, yielding a separate time series of complex wavelet coefficients for each frequency. These complex coefficients, containing both real and imaginary components, are used to derive a magnitude and phase angle (figure 2C).
Contrasting the 2 methods, for high-frequency signals, it is often assumed that the Morlet wavelet decomposition provides greater temporal resolution but poorer frequency resolution than the STFT. However, while these differences may be evident using the typical default settings for each method, parameters can be adjusted across frequencies in both methods such that they converge on the same resolution.26 Specifically, a modified STFT can use a time window that decreases linearly as frequency increases, rather than a fixed time window, as is implemented in EEGLAB software (http://sccn.ucsd.edu/eeglab/). Similarly, a modified wavelet approach can linearly increase the number of cycles used as frequency increases, rather than using a fixed number of cycles, as is implemented in Fieldtrip software (http://www.ru.nl/fcdonders/fieldtrip/). More generally, by exercising this kind of flexibility in the parameter settings for any given time-frequency decomposition method, many of the methods used can be shown to converge on the same results.26–28 Both EEGLAB and Fieldtrip toolboxes are open source, free utilities that run within Matlab and implement many of the measures of synchrony mentioned.
Regardless of the decomposition routine and associated analysis parameters selected, the output of spectral decomposition analysis is a complex data point, consisting of real and imaginary parts (figure 2C), for every point in time and for each frequency, for each trial, and for each electrode evaluated. These complex data are the launching point for calculating numerous measures that appear in the research literature describing the spectral characteristics of EEG, MEG, or intracranial electrocorticography data.
Power is calculated by squaring the magnitude (or length) of the vector defined by plotting the complex number coordinates, obtained from spectral decomposition of an EEG time series, on the 2-dimensional, real-imaginary, x-y plane. As such, it reflects the magnitude of the neuroelectric oscillations at specific frequencies. Approaches to calculation of power depend on the assumptions made about the stability of the EEG signal during the time window of interest, as well as the assumptions about the consistency of the phase angles of the oscillations across trials.
When EEG oscillations are assumed to be stable or “stationary” over time, the FFT is often used to spectrally decompose this (usually extended) period of time-invariant EEG. This is done, for example, with resting EEG (also known as quantitative EEG29) or with steady-state paradigms in which a stimulus is continuously repeated at a fixed frequency for an extended time period, driving the EEG at that specific frequency.30,31 The result is a single power spectrum that captures the average magnitude of oscillations for individual frequency bins integrated over the entire time period analyzed. The frequency resolution is determined by the duration of the analyzed time window (the bin width equals the reciprocal of the window length), whereas the rate at which the EEG time series was digitized determines the highest resolvable frequency.
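A minimal sketch of such a stationary analysis, assuming synthetic "resting" EEG with a dominant 10-Hz alpha rhythm and illustrative parameter values (a 250-Hz digitization rate and a 4-s window, giving 0.25-Hz bins):

```python
import numpy as np

fs = 250.0                     # sampling rate (Hz); illustrative
duration = 4.0                 # 4 s of simulated resting EEG
t = np.arange(0, duration, 1/fs)

# Synthetic stationary signal: a 10-Hz alpha rhythm plus noise
rng = np.random.default_rng(1)
eeg = 2.0*np.sin(2*np.pi*10*t) + rng.standard_normal(t.size)

# FFT-based power spectrum; bin width is 1/duration = 0.25 Hz
spectrum = np.fft.rfft(eeg)
power = np.abs(spectrum)**2 / t.size
freqs = np.fft.rfftfreq(t.size, d=1/fs)

# The spectrum peaks at the simulated alpha frequency (skip the DC bin)
peak_freq = freqs[np.argmax(power[1:]) + 1]
```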
When EEG activity cannot be assumed to be stable over the time period of interest, as when it reflects the unfolding sensory, perceptual, and cognitive stages of information processing initiated by an event, the various methods of time-frequency decomposition described above are applied. These methods characterize event-related changes in power, relative to a pre-event baseline period, in EEG epochs time locked to task events such as stimulus presentations or responses. When the magnitude values are squared for each time-frequency data point and then averaged over trials, the result is a 2-dimensional matrix containing total power of the EEG at each frequency and time point. Total power captures the magnitude of the oscillations irrespective of their phase angles. As such, it comprises 2 major sources of event-related oscillatory power, evoked power and induced power.
Evoked power refers to event-related changes in EEG power that are phase-locked with respect to the event onset across trials. The phase-synchronized oscillations in the EEG across trials are isolated by first time domain averaging the event-locked EEG epochs to derive the ERP. Frequencies that are phase synchronized with respect to stimulus onset across repeated trials survive the averaging process and can be seen in the average ERP. This is not the case for oscillations that are out of phase with respect to stimulus onset across trials, which cancel out toward zero during the averaging used to generate ERPs. Accordingly, evoked power is calculated by spectral decomposition of an individual's ERP, squaring the magnitude values associated with each time and frequency point in the time-frequency matrix. Evoked power in specific frequencies, such as the gamma band, has been linked to sensory registration32–34 as well as to top-down cognitive processing35,36 of stimulus events and generally occurs within the first 200 milliseconds following stimulus onset.
Induced power refers to event-related changes in EEG power that are time-locked, but not phase-locked, with respect to the event onset across trials. Induced power, also known as “asynchronous power” or “phase-invariant power,” is contained within, and is sometimes confused with, total power because the latter is calculated from time-frequency decomposition of single-trial EEG epochs using only the squared magnitude information without regard to the phase of the signal.37 Similar measures are referred to as event-related desynchronization or synchronization16,17 or time-varying energy.38 When implemented in EEGLAB,10 total power is referred to as “event-related spectral perturbation.” Although these variously named total power measures are considered to be insensitive to stimulus-evoked phase locking, unless evoked power is explicitly removed from measures of total power, they all actually contain both phase-locked and non–phase-locked power. This may be particularly true in the lower frequencies such as the delta and theta bands, where phase-locked ERPs may manifest as increases in total power. For example, Makeig39 noted that the total power peak in the theta band following auditory stimuli overlaps in both time and frequency with ERP peaks, supporting the idea that the early peak in theta power likely contains the energy from the phase-locked ERP as well as contributions from trial-to-trial ERP variance. Thus, to isolate pure induced power, evoked power must be removed from the single trial–based total power estimate. Unfortunately, there is not agreement in the field as to how, or whether, this subtraction should be performed.39–42
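A schematic simulation of the distinction, assuming synthetic trials containing one phase-locked and one randomly phased 40-Hz component (all parameters illustrative); the final subtraction is shown only as one of the debated approaches to isolating induced power, not an endorsed method:

```python
import numpy as np

fs, f = 250.0, 40.0                  # sampling rate and gamma frequency; illustrative
t = np.arange(0, 0.4, 1/fs)          # 400-ms poststimulus window
n_trials = 100
rng = np.random.default_rng(2)

# Simulated trials: a phase-locked 40-Hz burst (same phase every trial)
# plus a non-phase-locked 40-Hz component with random phase per trial.
trials = np.array([
    np.sin(2*np.pi*f*t)                                   # evoked (phase-locked)
    + np.sin(2*np.pi*f*t + rng.uniform(-np.pi, np.pi))    # induced (random phase)
    for _ in range(n_trials)
])

# Single-frequency decomposition: project each trial onto a complex sinusoid.
kernel = np.exp(-2j*np.pi*f*t)
coeffs = trials @ kernel                    # one complex coefficient per trial

total_power = np.mean(np.abs(coeffs)**2)    # phase ignored; contains both sources
erp = trials.mean(axis=0)                   # cross-trial average (the ERP)
evoked_power = np.abs(erp @ kernel)**2      # decomposition of the ERP

# One debated estimate of induced power: total minus evoked.
induced_power = total_power - evoked_power
```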
Interest in induced power stems from early work by Gray and Singer43 involving recordings of multiunit activity and local field potentials from cat visual cortex during visual stimulus processing. This work showed that interneuronal synchronization occurred on each trial, but the latency of this synchronization with respect to stimulus onset was variable across trials. This set the stage for subsequent observations that perceptual processes, such as the binding of disparate stimulus features to form a percept,44 and cognitive control processes, such as working memory45 or preparation to overcome a prepotent response tendency,46 are often associated with phase-asynchronous power changes that nonetheless occur in approximately the same latency windows across trials with respect to stimulus onset.
In addition to evoked and induced power, other potential contributions to total power come from background and spontaneous EEG power.40,44,47 However, if these EEG power signals are not event related, they are removed in the baseline correction and trial-averaging processes, respectively. The comparison of figure 3D and 3F illustrates the importance of baseline correction to detect stimulus-related changes in total power.
Event-related phase consistency, or phase locking with respect to an event's onset, across trials can be calculated within one electrode, complementing the total power measure described above. To this end, we use the phase information shown in figure 3C in which magnitude information has been unit normalized (ie, transformed to 1). By averaging these normalized complex numbers across trials for each time point and frequency bin, a 2-dimensional matrix of time-frequency values describing the consistency of the phase angles with respect to an event's onset is obtained. Specifically, each value in this time-frequency matrix is a real number (figure 3E) between zero and one, with zero reflecting a completely uniform random distribution of phase angles between trials and with one reflecting identical, or perfectly synchronized, phase angles across trials. The measure defined by these values has been called phase-locking factor (PLF)38 or intertrial (phase) coherence (ITPC),10 and it represents one minus the circular variance of phases (ie, phase variance48) for each time-frequency point examined. Event-related phase locking is an important complement to total power because the complex number magnitude values on which power calculations depend have no influence on the phase angles used to calculate phase locking.
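The PLF computation can be sketched directly from this definition, using hypothetical phase-angle distributions (a tightly clustered set and a uniform set) rather than real data:

```python
import numpy as np

n_trials = 200
rng = np.random.default_rng(3)

# Hypothetical phase angles at one time-frequency point across trials:
# one set clustered around zero, one uniformly distributed.
locked_phases = rng.vonmises(0.0, 5.0, size=n_trials)
uniform_phases = rng.uniform(-np.pi, np.pi, size=n_trials)

def plf(phases):
    """Phase-locking factor: magnitude of the mean unit-length complex vector."""
    return np.abs(np.mean(np.exp(1j * phases)))

# Clustered phases give a PLF near 1; uniformly random phases give a PLF near 0.
plf_locked = plf(locked_phases)
plf_uniform = plf(uniform_phases)
```

Note that the magnitudes never enter the calculation: every trial contributes a unit-length vector, so only phase consistency drives the result.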
The term “PLF” is unfortunately very similar to one of the terms commonly used to describe the consistency of phase differences between 2 electrodes across trials, “phase-locking value.”49 One of the challenges for the field is to adopt consistent terminology that more sharply distinguishes among the various types of synchrony measures. Traditionally, the term “coherence” was used in EEG to describe the consistency of the signals between 2 electrodes. Accordingly, to enhance the clarity of our presentation, we adopt the term PLF,38 rather than “ITPC,”10 to refer to event-related phase consistency across trials within a single electrode, reserving the term “coherence” for various measures of consistency of the signals recorded from 2 channels (eg, electrodes, MEG sensors, underlying regional brain sources) across trials.
Because of the multichannel nature of EEG recordings, based on arrays of electrodes sampling signals across the scalp, there is a long tradition of analyzing the consistency between the EEG from pairs of electrodes in an attempt to address the brain's regional connectivity and interregional interaction.50 The traditional approach to characterizing the consistency of the EEG signals from 2 channels across trials involves calculating, for each frequency, the linear relationship between the 2 complex signals derived from spectral decomposition of the EEG, in a manner analogous to the Pearson product-moment-correlation coefficient. When this EEG consistency measure is a complex coefficient, retaining both real and imaginary components, it is known as “coherency,” whereas when the measure is based on isolating the magnitude information from coherency, it is known as “coherence.”51 To enhance insight into the nature of coherency and coherence coefficients, we start by recounting the equation for the simple Pearson correlation coefficient. The Pearson correlation between 2 paired variables, x and y, can be defined as the standardized covariance of x and y, with standardization achieved by dividing the covariance by the product of the SDs of x and y:

r_xy = cov(x, y) / (s_x · s_y)
Coherency is similarly defined as the standardized cross-spectrum of complex signals X and Y across trials, derived from spectral decompositions of the time series (t) for a given frequency (f), with standardization achieved by dividing the cross-spectrum by the square root of the product of the power spectrum of X and the power spectrum of Y. The cross-spectrum, analogous to the covariance in the Pearson correlation equation, is defined as the expected value (over trials) of the product of the complex signal X and the complex conjugate (The complex conjugate of any complex number can be represented on the real-imaginary, x-y, plane as a reflection [ie, a mirror flip] across the real axis [eg, the complex conjugate of 2 + 3i = 2 − 3i].) (denoted by *) of the complex signal Y:

S_XY(f, t) = E[ X(f, t) · Y*(f, t) ]
The power spectrum of signal X at a given frequency and time across trials, analogous to the variance of x in the Pearson correlation formula, is equivalent to the cross-spectrum of X with itself and is defined as

S_XX(f, t) = E[X(f, t) · X*(f, t)] = E[|X(f, t)|²]
Accordingly, coherency is defined as

C_XY(f, t) = S_XY(f, t) / √(S_XX(f, t) · S_YY(f, t))
Of note, the product of a complex number and its complex conjugate yields a real number (ie, the squared magnitude), whereas the product of 2 different complex numbers, as is usually the case for the cross-spectrum of X and Y (SXY), yields a complex number. Therefore, the power spectra are real numbers, but the cross-spectrum, as well as coherency itself, is a complex number. As complex numbers, the cross-spectrum or the coherency can be expressed in terms of their “cross-magnitude” (ie, square root of the “cross-power”) and their “relative phase” (ie, the average phase difference between the 2 channels across trials).
When the magnitude of coherency is isolated by taking its absolute value (The absolute value of a complex number is equal to its magnitude, with phase information dropping out.), the resulting coefficient is referred to as “coherence,” which is a real, rather than a complex, number.51 Unfortunately, terminology in the literature is not used consistently, so in some descriptions10,14 the term “coherence” is used to refer to the magnitude of coherency squared. To minimize confusion, we refer to this quantity as “magnitude squared coherence” (MS coherence), which is defined as the squared absolute value (ie, magnitude) of the cross-spectrum divided by the product of the power spectra of X and Y.
MS coherence is analogous to the squared Pearson correlation, r2, which is the squared covariance divided by the product of the variance of x and the variance of y. Just as r2 describes the proportion of variance in y accounted for by a linear transformation of x, the magnitude squared coherence reflects the proportion of variance of channel X at frequency (f) that can be accounted for by a constant linear transformation of the complex spectral coefficients derived from channel Y.14 As with other correlation coefficients, the traditional coherence measure has a skewed sampling distribution that is typically normalized using a Fisher z transform.52
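These definitions can be made concrete in a brief numerical sketch (written here in Python with NumPy; the function names are ours, not drawn from any EEG package). Coherency and MS coherence at a single time-frequency point are computed from trial-wise complex spectral coefficients:

```python
import numpy as np

def coherency(X, Y):
    """Coherency at one time-frequency point from trial-wise complex
    spectral coefficients X and Y (one complex value per trial)."""
    Sxy = np.mean(X * np.conj(Y))     # cross-spectrum (complex)
    Sxx = np.mean(np.abs(X) ** 2)     # power spectrum of X (real)
    Syy = np.mean(np.abs(Y) ** 2)     # power spectrum of Y (real)
    return Sxy / np.sqrt(Sxx * Syy)

def ms_coherence(X, Y):
    """Magnitude squared coherence: squared absolute value of coherency."""
    return np.abs(coherency(X, Y)) ** 2

rng = np.random.default_rng(0)
# one channel with arbitrary amplitudes and phases, one trial per entry
X = rng.uniform(0.5, 2.0, 500) * np.exp(1j * rng.uniform(-np.pi, np.pi, 500))
print(ms_coherence(X, X * np.exp(1j * 0.5)))   # ~1.0: constant phase lag
```

A constant cross-trial phase lag yields MS coherence near 1 regardless of amplitude, whereas uniformly random phase differences drive it toward 0.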
MS coherence has combined sensitivity to both magnitude and phase synchrony between 2 channels. However, MS coherence is more influenced by phase relationships than magnitude relationships in that it approaches zero when the distribution of the phase differences between channels is randomly uniform across trials, whereas consistent phase differences and unrelated cross-channel magnitudes across trials yield non-zero MS coherence values. In effect, MS coherence is a measure of the consistency of phase differences between 2 channels weighted by the product of their respective magnitudes. This has been regarded as a strength of the traditional coherence measure, in that less weight is given to phase differences when the magnitudes of the signals are weak.14 Nonetheless, the sensitivity of MS coherence to both magnitude and phase relationships has also been regarded as a limitation of the measure because the partial confounding of magnitude and phase makes its interpretation somewhat ambiguous. Moreover, the changes induced in the magnitude and phase of neural oscillations by events are dissociable, both in theory7 and empirically,8 and are likely subserved by distinct neurobiological mechanisms (eg, Pinto et al53). These considerations have motivated the development of measures that separately quantify the consistency of cross-trial phase differences between 2 channels as well as the correlation of their magnitudes across trials.
When the magnitudes of the frequency- and time-specific complex numbers, derived from spectral decomposition of single-trial EEG epochs, are unit normalized (ie, set to 1) prior to calculation of coherency, the resulting coherency estimate provides a measure of the consistency of phase differences between the channels across trials, unweighted by magnitude. This special case of coherency is known by different names in the literature, including “phase coherence,”14,51 “phase consistency,”26 “phase-locking value,”49,54 “phase synchronization,”55 or simply “phase synchrony.”27 Because we adopted the convention of reserving the term “coherence” to refer to between-channel consistency of EEG, we will use the term “phase coherence” to refer to this measure of consistency of phase differences between channels.
The magnitude-normalized complex data shown in figure 3C, which were used to calculate PLF, are also used to calculate the phase coherence, except that such data are obtained from 2 recording sites. Next, the difference between the phase angles of the signals from each electrode is calculated for each time-frequency data point and for every trial. The single-trial phase differences, represented by complex numbers, are then averaged across trials. The absolute value of this complex average yields the magnitude value that defines the phase coherence. Phase coherence=0 indicates completely uniform random phase angle differences across trials, whereas phase coherence=1 indicates completely consistent phase angle differences across trials. This pairwise phase coherence is sometimes calculated for every unique electrode-pair combination available in a given scalp electrode montage and then averaged across all pairs, yielding a single, 2-dimensional phase coherence matrix summarizing the overall pairwise phase coherence across the entire electrode montage.56 Such an overall phase coherence measure is sometimes accompanied by a scalp map of the EEG electrodes with lines connecting all electrode pairs whose coherence has exceeded some statistical significance threshold within a specific frequency band and time window.
Particularly because there are now studies documenting schizophrenia-related deficits in cross-trial consistency of phase differences across recording sites using phase coherence56,57 and cross-trial consistency of phase within recording sites using PLF,13,58–61 it is important to emphasize the complete dissociability and complementarity of these 2 measures. High phase coherence suggests that the difference in phase angles between signals from sites A and B is consistent from trial to trial, but the actual phase angles across trials from site A (or site B) need not show high cross-trial consistency, ie, the PLF within an electrode site may be as small as zero. The difference between phase coherence and PLF measures can be made clear if we consider the location of the hour and minute hands on a clock to represent the phase angles (ranging from −180° to 180°) from 2 electrode sites, respectively, for a specific time-frequency data point on a single trial. Consider 4 unique “trials” represented by the times 12:15, 3:30, 6:45, and 9:00. The phase angle difference between the hour and minute hands across these 4 trials is always 90°. Therefore, the phase coherence across these trials equals one, indicating perfectly synchronous phase differences between the sites across trials. However, the PLF for the hour hand and the PLF for the minute hand are essentially zero because the phase angles for the individual hands are inconsistent across trials. Now consider another set of 4 trials represented by the times 12:14, 12:15, 12:16, and 12:17 on our hypothetical phase angle “clock”. Again, the phase angle difference between the hour and minute hands across these new trials is very close to 90°, which would produce a phase coherence that is very close to one. This set would also produce a PLF very close to one for the hour hand (site A) and the minute hand (site B) because their individual phases are much more consistent across trials than in the first example.
Thus, phase coherence and PLF are completely dissociable measures that reflect very different types of phase synchrony.
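The clock illustration can be checked numerically. In this small sketch (the helper functions are our own, written for this example, with idealized hand positions), the first set of "trials" yields perfect phase coherence but essentially zero PLF, while the second set yields high values on both measures:

```python
import numpy as np

def plf(phases):
    """Phase-locking factor: resultant length of unit phasors across trials."""
    return np.abs(np.mean(np.exp(1j * phases)))

def phase_coherence(ph_a, ph_b):
    """Cross-trial consistency of the phase difference between two sites."""
    return np.abs(np.mean(np.exp(1j * (ph_a - ph_b))))

rad = np.deg2rad
# first set of "trials": 12:15, 3:30, 6:45, 9:00
hour = rad([0.0, 90.0, 180.0, 270.0])       # site A phase per trial
minute = rad([90.0, 180.0, 270.0, 0.0])     # site B phase per trial
print(phase_coherence(hour, minute))        # ~1.0: difference is always 90 deg
print(plf(hour), plf(minute))               # ~0: individual phases cancel out

# second set of "trials": 12:14 through 12:17
hour2 = rad([0.0, 0.0, 0.0, 0.0])
minute2 = rad([84.0, 90.0, 96.0, 102.0])
print(phase_coherence(hour2, minute2), plf(minute2))   # both ~0.99
```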
One caution that applies equally to PLF and to phase coherence is the importance of having similar numbers of trials contributing to calculations when comparing 2 conditions. PLF and phase coherence are sensitive to the number of trials included, particularly when the trial number is small. In the extreme case, the PLF or phase coherence involving only one trial yields a value of 1, and experimental conditions containing fewer trials will generally have higher PLF or phase coherence than conditions containing more trials.
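This trial-count bias is easy to demonstrate. The following sketch (illustrative only) computes the average PLF of pure noise, ie, uniformly random phases, at several trial counts:

```python
import numpy as np

rng = np.random.default_rng(1)

def plf(phases):
    # phase-locking factor: resultant length of unit phasors across trials
    return np.abs(np.mean(np.exp(1j * phases)))

# average PLF of uniformly random phases at several trial counts;
# the expected value of this noise floor shrinks roughly as 0.89 / sqrt(n)
mean_plf = {n: np.mean([plf(rng.uniform(-np.pi, np.pi, n)) for _ in range(500)])
            for n in (1, 10, 100)}
print(mean_plf)   # roughly {1: 1.0, 10: 0.28, 100: 0.09}
```

A single trial always yields a PLF of exactly 1, and the noise floor remains inflated for small trial counts, which is why conditions should be compared with similar trial numbers.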
Just as phase coherence calculates phase consistency between channels across trials, there are algorithms to calculate “pure” cross-channel consistency of EEG magnitudes (or magnitude squared, ie, power) across trials. Because magnitude values can be obtained from complex numbers (as described in the total power section above), one approach used to estimate “magnitude-only” cross-channel consistency is to extract these real magnitude values, at a specific frequency and time point in the event-related epoch, for each electrode pair across all trials, and then to correlate them using the Pearson correlation. However, it has been argued that EEG magnitudes can vary slowly (ie, over minutes) across trials and brain regions as a subject's state of arousal drifts over time, attenuating the magnitude correlation across trials. Accordingly, one alternative measure, the amplitude envelope correlation (AEC),62 focuses, for each frequency of interest, on the correlation of the EEG magnitudes contained within a short time window (ie, the amplitude envelope) within an epoch. This time window is moved across the epoch to provide time-specific correlation values. These correlation values are normalized using Fisher's r-to-z transformation and then averaged across trials. Unlike MS coherence, AEC values can be very high even when phase differences are randomly distributed and the 2 signals have low MS coherence. This also means that AEC and phase coherence measures provide independent complementary measures of the synchrony of magnitude and phase, respectively, between 2 recording sites.
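A rough sketch of the AEC computation, assuming trial-wise amplitude envelopes for one frequency and one time window have already been extracted from the 2 channels (the function and variable names here are ours, not from the original AEC paper):

```python
import numpy as np

def aec(env_a, env_b):
    """Amplitude envelope correlation for one frequency and time window.
    env_a, env_b: (n_trials, n_samples) magnitude envelopes from two
    channels, one row per trial."""
    zs = []
    for a, b in zip(env_a, env_b):
        r = np.corrcoef(a, b)[0, 1]   # per-trial Pearson correlation
        zs.append(np.arctanh(r))      # Fisher r-to-z normalization
    return np.tanh(np.mean(zs))       # average z, back-transformed to r
```

Sliding the window across the epoch and recomputing yields time-resolved AEC values; note that the phases of the 2 channels never enter the calculation, which is why AEC can be high when MS coherence is low.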
EEG signals are transmitted from sources in the brain to the scalp surface in ways that reflect the geometry of the cortical folds and the orientation of the dipoles generated by changes in excitatory and inhibitory postsynaptic potentials within neuronal assemblies. These electrical currents are volume conducted through the brain's tissue and spatially smeared as they are conducted through the cerebrospinal fluid, dura mater, skull, and scalp. The result is that EEG activity seen at one scalp electrode does not necessarily reflect the activity of the directly underlying cortex; rather, it reflects the summated activity from multiple sources volume conducted across variable distances to reach the electrode. Just as importantly, EEG activity from a particular source is transmitted to multiple electrodes across the scalp, depending on the orientation and strength of the dipoles comprising the source activity. Consequently, spatial coherence analyses between electrode sites can be confounded by shared activity from a third source, creating essentially spurious coherence between sites. While this may be less problematic the farther apart electrodes are in space, a source near the rostral vertex of the cortex with a transverse anterior to posterior dipole orientation with respect to the scalp surface can transmit its signals to both the anterior and posterior scalp sites without being evident in the electrodes directly over the source itself. Thus, even electrodes that are relatively far apart can largely reflect activity from the same underlying source. Related to the “shared source” problem is the fact that the electrical activity captured by EEG electrodes reflects voltage differences with respect to a common reference electrode or set of reference electrodes.
Accordingly, the activity in the reference electrode can also introduce spurious coherence between EEG channels.14,63,64 This is not a problem for MEG recordings, which provide reference-free measures of magnetic fields.
To address the problem of single source contamination of multiple sensors via volume conduction, some have advocated for methods that reduce the spatial smoothing and smearing of scalp EEG activity, such as Laplacian transforms used to generate current source density (CSD) waveforms.50,65 These CSD waveforms reflect what is unique to each electrode while minimizing activity that is broadly distributed across multiple electrodes. Another approach involves isolating statistically independent waveforms from the scalp EEG data using independent component analysis, followed by assessment of MS coherence and/or phase coherence between independent components.10 Because of the contamination of electrodes or sensors by volume-conducted signals, the absolute value of phase coherence between sites is probably less meaningful than a relative comparison of phase coherence between 2 experimental conditions. However, even here, caution is required because when a third source is active in one experimental condition, but not another, it can create the spurious impression that one condition is associated with increased coherence between sites, relative to the second condition.
Distinct from methods that attempt to isolate more localized or independent activity within the array of scalp sensors are methods that attempt to isolate the underlying neural sources of scalp-recorded activity. These source methods examine spatial synchronization of oscillations between brain sources, in “source space” rather than in “sensor space.”50,66,67 This requires source modeling to solve the so-called inverse problem, modeling the location and number of sources that give rise to the EEG distributions across the scalp surface. This problem is “ill posed” in that multiple source model solutions can give rise to the same scalp data; the problem cannot be uniquely solved but requires the investigator to make assumptions about the number and general location of sources or to use approaches such as minimum-norm methods68 that allocate EEG activity to thousands of small sources across the entire cortex and subcortical brain regions. Phase coherence between source waveforms is implemented in software packages such as BESA.69,70 Some have even argued that modeling source activity does not eliminate the possibility that 2 sources are spuriously coherent because they are contaminated by activity from a third source.51 Accordingly, other alternatives have been proposed, such as examination of coherence between only the imaginary component of the complex data estimated for each source,51 yielding a coherency measure that is insensitive to phase-coherent oscillations involving signals with a zero phase lag between them. Zero phase lag between phase-coherent signals is presumed to reflect activity from the same source because communication between sources takes time, resulting in a phase lag between the synchronous oscillations.
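The logic of the imaginary component of coherency can be illustrated with a toy example (our own construction): zero-lag synchrony, as produced by a single source volume conducted to 2 sensors, leaves the imaginary part of coherency near zero, whereas a quarter-cycle lag survives entirely in it.

```python
import numpy as np

rng = np.random.default_rng(3)
X = np.exp(1j * rng.uniform(-np.pi, np.pi, 400))   # one unit phasor per trial

def coherency(X, Y):
    # cross-spectrum normalized by the geometric mean of the power spectra
    Sxy = np.mean(X * np.conj(Y))
    return Sxy / np.sqrt(np.mean(np.abs(X) ** 2) * np.mean(np.abs(Y) ** 2))

# zero lag (eg, one source seen at two sensors): coherency is real,
# so its imaginary part is ~0 and the "coupling" is discarded
print(np.imag(coherency(X, X)))                              # ~0
# a quarter-cycle lag between the channels survives in the imaginary part
print(np.imag(coherency(X, X * np.exp(-1j * np.pi / 2))))    # ~1.0
```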
Of note, studies of spatial phase coherence in schizophrenia published to date57,71 have been based on analyses of spatially untransformed EEG data in scalp sensor space, not source space, and therefore, results are potentially influenced by spurious phase coherence arising from contamination of the scalp sensors by activity from the same underlying source(s).
The measures described so far characterize the synchronization of phase and power with respect to an eliciting event or between 2 spatially segregated signals but always within the same frequency or frequency band. In contrast, there is another class of measures, broadly referred to as cross-frequency “coupling” of EEG power or phase. Coupling describes synchronous activity between magnitude and/or phase components of data in 2 different frequencies. It has been defined as a cross-frequency relationship between 2 distinct frequencies in the continuous, recorded signal.72 Coupling measures deviate from all previous measures discussed above because they examine relationships between different frequencies, time locked to features of the phase or magnitude of one of the frequencies. Theta-gamma coupling is described in detail by Lisman and Buzsáki.4
Cross-frequency coupling can be estimated for phase-magnitude, phase-phase, or magnitude-magnitude data from pairs of frequencies. Beyond studying theta-gamma coupling, a computational problem quickly arises if there is no predetermined notion of the coupling type and specific frequencies to examine. Looking for all 3 types of coupling between 50 measured frequencies at 32 electrode sites quickly leads to over 100,000 comparisons for a single time sample. Thus, the question of where to look for coupling is an important consideration.
One MEG study of phase-phase coupling (called “n:m phase synchronization”) restricted the number of comparisons by examining relationships between phases of a base frequency and its harmonic frequencies (ie, n=5, m=10 or any other multiple of 5) only.72 The study attempted to link the phases of a coordinated motor behavior (hand tremor in a patient with Parkinson disease) with the phases of coordinated cortical activity by designating the frequency of the tremor (n=5–7 Hz) as the base frequency and then searching for phase coupling in the MEG data from sensorimotor and premotor areas specifically in the harmonic frequencies of interest (m=10–14 Hz).
Another proposed method74 to deal with the computational complexity of coupling analysis is to examine the coupling between the phase of total power fluctuations of a high-frequency band and the phase of a low-frequency EEG oscillation over the same time window. This is accomplished by performing an FFT on the time series of total power values for a higher frequency band of interest and then designating the peak frequency from the FFT as the lower frequency of interest to be extracted over the same time period from the EEG. The phase of the low-frequency EEG oscillation is then examined for coupling with the phase of the higher frequency's power oscillation.
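A toy version of this procedure can be sketched with a simulated theta-modulated gamma signal. All names, parameter choices, and the crude single-frequency demodulation used here (in place of a full wavelet or Hilbert analysis) are illustrative assumptions, not the cited method's implementation:

```python
import numpy as np

fs = 500.0
t = np.arange(0, 4.0, 1 / fs)
theta = np.sin(2 * np.pi * 6 * t)                  # 6-Hz rhythm
gamma = (1 + theta) * np.sin(2 * np.pi * 40 * t)   # 40-Hz rhythm, theta-modulated
eeg = theta + 0.5 * gamma                          # simulated signal

def complex_amp(x, f, fs, n_cycles=5):
    """Crude quadrature demodulation: shift frequency f to DC, then smooth."""
    demod = x * np.exp(-2j * np.pi * f * np.arange(len(x)) / fs)
    w = int(n_cycles * fs / f)
    return np.convolve(demod, np.ones(w) / w, mode="same")

# step 1: time series of gamma-band magnitude (power) fluctuations
env = np.abs(complex_amp(eeg, 40, fs))

# step 2: FFT of that envelope; its peak designates the low frequency
spec = np.abs(np.fft.rfft(env - env.mean()))
freqs = np.fft.rfftfreq(len(env), 1 / fs)
band = (freqs > 2) & (freqs < 20)                  # plausible low-frequency band
f_low = freqs[band][np.argmax(spec[band])]
print(f_low)                                       # 6.0: modulator recovered

# step 3: phase coupling between the gamma envelope and the EEG at f_low
core = slice(300, -300)                            # trim convolution edge effects
ph_env = np.angle(complex_amp(env - env.mean(), f_low, fs, n_cycles=3))
ph_eeg = np.angle(complex_amp(eeg, f_low, fs, n_cycles=3))
print(np.abs(np.mean(np.exp(1j * (ph_env[core] - ph_eeg[core])))))   # ~1.0
```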
Measures of coupling and procedures to define comparison frequencies are certainly not limited to those mentioned in this section, and further investigations may help to reduce the frequency selection problem of coupling analysis.
In all the event-related time-frequency measures described above, the question arises as to whether, or how, to take pre-event baseline activity into consideration. Because our interest is in capturing event-related changes in brain activity, baseline correction is generally implemented to adjust post-stimulus values for values present in the baseline period. Some measures, like PLF, or phase coherence, may not have significantly large values in the pre-event baseline period, and therefore, baseline correction may negligibly change the results. When baseline values are large and/or variable across trials, detection of event-related changes will generally benefit from baseline correction. However, there are a variety of approaches to performing baseline correction and some important considerations in choosing a baseline when analyzing time-frequency data. It is important to understand that the choices made in implementing baseline correction can influence the results of analyses and that differences in the baseline correction procedure may be one reason for inconsistent results across studies.
In general, for each frequency in a time-frequency matrix resulting from any of the methods described above, a baseline period is defined by the average of the values within a time window preceding the time-locking event. There are at least 4 common methods for baseline correction in time-frequency analyses. One method involves a simple subtraction of baseline values from all the values in the epoch.58 This is the most common approach to baseline correction of ERP data. A second method involves dividing the baseline-subtracted values by the baseline, producing a “percent change from baseline” value70. A third related method involves dividing the value at each time point in the epoch by the baseline value and then taking the log10 transform of this quotient and multiplying it by 20, yielding values expressed in units of decibels (dB)10 (figure 2F). A fourth method involves subtracting the baseline from each value then dividing this difference by the SD of the values contained in the baseline period, yielding baseline-adjusted z scores.27,49,54,72 Methods 2 through 4 have the advantage of removing overall scale differences between frequencies and between individuals, rendering them more directly comparable. Method 4, which expresses the deviations from baseline in SD units that take into account the variability in the baseline period, may be particularly sensitive to small changes, relative to a simple unstandardized baseline subtraction method.
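The 4 correction methods can be sketched as follows for a time-frequency matrix (the helper below is a hypothetical illustration, not drawn from any particular toolbox):

```python
import numpy as np

def baseline_correct(tf, times, method, base=(-0.3, -0.1)):
    """Baseline-correct a time-frequency matrix tf (n_freqs x n_times).
    times: epoch time points in seconds; base: baseline window in seconds."""
    idx = (times >= base[0]) & (times < base[1])
    mu = tf[:, idx].mean(axis=1, keepdims=True)   # per-frequency baseline mean
    sd = tf[:, idx].std(axis=1, keepdims=True)    # per-frequency baseline SD
    if method == "subtract":                      # method 1: simple subtraction
        return tf - mu
    if method == "percent":                       # method 2: percent change
        return 100 * (tf - mu) / mu
    if method == "db":                            # method 3: decibel transform
        return 20 * np.log10(tf / mu)
    if method == "z":                             # method 4: baseline z score
        return (tf - mu) / sd
    raise ValueError(method)

times = np.linspace(-0.5, 0.5, 101)
tf = np.ones((3, 101))
tf[:, times >= 0.0] = 2.0        # toy data: magnitude doubles at event onset
print(baseline_correct(tf, times, "db")[0, -1])   # ~6.02 dB
```

For toy data whose magnitude doubles at onset, methods 1 through 3 yield post-stimulus values of 1.0, 100%, and about 6.02 dB, respectively, illustrating how the same change is expressed on different scales.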
Concerning baseline correction, whether by division (as in the dB scale transform) or by subtraction, it is important to take into consideration the length of the temporal window used in time-frequency decomposition (eg, STFT or wavelet window). Because the complex value estimated at any given center time point within the time window is influenced by all of the points encompassed by the window, half of the time points influencing the complex value estimate at an event's onset (time=0 milliseconds) actually follow its onset, whereas the other half precede it. Other complex values at time points preceding an event's onset are influenced by post-onset data up until the time point corresponding to one-half the length of the STFT or wavelet window. In order to minimize the temporal smearing of post-onset activity into the baseline, some47 recommend that the baseline period should end no closer to the event's onset than one-half the length of the wavelet or STFT window. However, inasmuch as the sliding temporal windows used in most spectral decomposition methods are weighted by tapered envelopes (eg, Gaussian, Hanning, Hamming, triangular), minimizing the influence of data points furthest from the window's center time point, studies commonly extend the baseline period all the way to an event's onset without serious consequence.
Perhaps more important than the endpoint of a baseline period relative to an event's onset is the proximity of the baseline's starting point relative to the beginning of the EEG epoch. A baseline period should begin in the epoch after at least one-half the length of the temporal window used in time-frequency decomposition in order to avoid “edge effects” (ie, distortions resulting from convolution of the temporal windowing function with a data time series that does not extend over the full length of the temporal window).47,75 A similar consideration applies in choosing the last post-event time point analyzed relative to the end of the epoch.47
Another important consideration in defining pre-event baselines is that the duration of the baseline itself should be influenced by the EEG frequency being analyzed. Slower frequencies will benefit from longer baselines in order to capture a reasonably stable period of baseline activity. Although there are no widely accepted rules or conventions about this, if one were to adopt the convention that the baseline duration should be at least long enough to capture a full cycle of the frequency of interest, then the baseline duration for a 4 Hz frequency, for example, would need to be at least 250 milliseconds.
Amplifier and filter settings for EEG data acquisition must attend to some important considerations in order to record data suitable for time-frequency analysis. The EEG signal must be sampled at a fast enough rate to avoid frequency aliasing of the signal. Aliasing is the misrepresentation of a high-frequency signal as a lower frequency signal due to temporal undersampling. The minimum sampling rate needed to avoid aliasing, known as the Nyquist rate, is twice as fast as the highest frequency of interest, although most EEG acquisition software imposes an even higher standard such as a sampling rate that is 4 times the highest frequency of interest. In addition, if data are acquired with a bandpass filter setting, it is important to set the low-pass filter above the highest frequency of interest so that oscillation frequencies of interest are not removed from the data. This issue comes up because low-pass filters were often set to cut-offs between 30 and 50 Hz for traditional ERP paradigms in order to eliminate 60 Hz (50 Hz) line noise from the acquired EEG data. While this worked well for ERP analysis, it usually precluded any meaningful time-frequency analysis.
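Aliasing can be demonstrated in a few lines (a self-contained toy, not EEG data): a 60-Hz sine sampled at 100 Hz, below the 120-Hz rate the Nyquist criterion would require, is indistinguishable from a 40-Hz sine.

```python
import numpy as np

fs = 100.0                      # sampling rate (Hz); Nyquist frequency = 50 Hz
t = np.arange(0, 1, 1 / fs)     # 1 second of samples

x = np.sin(2 * np.pi * 60 * t)  # a 60-Hz tone, above the Nyquist frequency
spec = np.abs(np.fft.rfft(x))
freqs = np.fft.rfftfreq(len(x), 1 / fs)
print(freqs[np.argmax(spec)])   # 40.0: the 60-Hz signal masquerades as 40 Hz
```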
To further support the concepts described above, a time-frequency analysis of EEG data from an auditory oddball paradigm is presented below. The purpose of this presentation is 2-fold: First, we conduct a Morlet wavelet analysis on the data from healthy control subjects in order to provide a detailed explanation of the wavelet procedure and to illustrate the impact of different parameter choices on the resulting spectral decomposition of the EEG data. Our points are illustrated by focusing on the well-characterized gamma response30,34 evoked by auditory stimuli, in this case by the standard tones, occurring about 50 milliseconds post-stimulus onset. Second, the PLF values that quantify the phase locking of this evoked gamma response are compared in healthy controls and patients with schizophrenia. The few published reports that have examined this auditory evoked gamma response in schizophrenia patients have generally not found significant abnormalities,34,71,76 despite evidence that schizophrenia patients show reduced auditory evoked gamma responses to 40-Hz click trains during auditory steady-state driving paradigms.58,77–79 Nonetheless, the dependence of gamma oscillations on neurotransmitter systems and circuits implicated in the pathophysiology of schizophrenia3–6 led us to hypothesize that the phase locking of the 50-millisecond poststimulus gamma response to auditory stimuli would be reduced in patients with schizophrenia. Details of the task, subject samples, ERPs, and time-frequency analyses from this study are described in more detail elsewhere,13 although the analysis of the 50-millisecond evoked auditory gamma response to standard tones was not previously reported.
The healthy control group (HC; n=22) comprised 13 men and 9 women recruited by newspaper advertisement from the community. The HC had no prior history of a major Axis I psychiatric disorder, including alcohol or drug abuse, based on the screening questions from the Structured Clinical Interview for Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV).80 In addition, the HC had no history of psychotic disorders in first-degree relatives. The majority of the HC group was right-handed (right-handed, n=19; left-handed, n=2; ambidextrous, n=1).
The schizophrenia patient group (SZ; n=21) comprised 4 women and 17 men recruited from our local outpatient clinics. All patients met DSM-IV criteria for schizophrenia (paranoid subtype, n=18; undifferentiated subtype, n=3) based on a Structured Clinical Interview for DSM-IV, and all were on stable doses of antipsychotic medication (atypical agent, n=17; typical agent, n=1; both atypical and typical agent, n=3). The majority of the SZ group was right-handed (right-handed, n=20; left-handed n=1). Patients were excluded if they met criteria for DSM-IV alcohol or drug abuse within the 30 days prior to study participation.
The groups were matched on age (mean ± SD: HC=37.3 ± 12.6 years, SZ=39.2 ± 10.4, difference not significant) and parental socioeconomic status (mean ± SD: HC=34.6 ± 15.2, SZ=38.3 ± 19.7, difference not significant), but the HC group had significantly (P < .0001) more years of education than the SZ group (HC = 16.2 ± 2.8 years, SZ = 13.3 ± 2.1 years). Exclusion criteria for both groups included any significant history of head injury, neurological disorders, or other medical illnesses compromising the central nervous system. The study was approved by the institutional review boards of Yale University School of Medicine and the VA Connecticut Healthcare system, and all subjects provided written informed consent prior to being enrolled in the study.
Subjects were presented with a pseudorandom sequence of 210 (P=.70) standard tones (500 Hz, 10-millisecond rise and fall time, 50-millisecond duration), 45 (P=.15) novel sounds, and 45 (P=.15) target tones, played at 80-dB SPL via headphones with an intertrial interval of 1250 milliseconds. Subjects responded to target tones with a button press.
EEG data were continuously recorded from 26 sites, referenced to linked earlobes, although only the data from electrode Fz are presented for the demonstration of parameter influences on the wavelet analysis. For the group comparisons, electrodes F3, Fz, F4, C3, Cz, and C4 were analyzed. EEG data were digitized at a rate of 1 kHz with a 0.05- to 100-Hz bandpass filter. No additional filters were applied to the data offline. EEG associated with standard tone trials were segmented into epochs spanning 500 milliseconds before the tone onset to 600 milliseconds after it. Trials where the subject responded within 1200 milliseconds of the standard tone were considered false alarm errors and discarded.
EEG trial epochs were corrected for eye movements and blinks based on the vertical and horizontal electrooculogram channels using a regression approach.81 Epochs were then baseline corrected by subtracting the −100 to 0 milliseconds prestimulus baseline from all data points in the epoch. Finally, trials containing artifacts exceeding ±100 μV were discarded, and the number discarded did not differ significantly (P=.23) between the groups (mean ± SD trials surviving artifact rejection, HC=202 ± 14 trials, SZ=196 ± 18 trials).
Standard tone EEG epochs were analyzed with a complex Morlet wavelet decomposition38,41 using freely distributed FieldTrip (http://www.ru.nl/fcdonders/fieldtrip/) software in Matlab. This is by no means the only approach to wavelet analysis or to time-frequency decomposition of EEG. However, we focus on this method in order to provide a more detailed example of how different parameter choices influence the results of time-frequency analyses.
The Morlet wavelet transform is defined by setting parameters for the general “mother wavelet,” which is then used to generate the family of wavelets covering the frequencies to be extracted during the spectral decomposition of EEG data. The Morlet wavelet is a complex wavelet, comprising real and imaginary sinusoidal oscillations, that is multiplied by a Gaussian envelope so that the wavelet magnitude is largest at its center and tapers toward its edges (see figure 4). The wavelet's Gaussian distribution around its center time point has a SD of σt. The wavelet also has a Gaussian shaped spectral bandwidth around its center frequency, f0, that has a SD of σf. The temporal SD, σt, is inversely proportional to σf (ie, σt ~ 1/σf), consistent with the Heisenberg uncertainty principle described earlier that as temporal precision increases (ie, shorter σt) frequency precision decreases (ie, larger σf). The exact relationship between them is defined by the formula σt =1/(2πσf). Furthermore, a wavelet is defined by a constant ratio of the center frequency, f0, to σf (ie, f0/σf = c), such that σf and σt vary with the center frequency, f0. This constant, c, is typically recommended to be greater than 5,21 and is often set to values of 6 or 7, which corresponds to a σt that encompasses at least one full sinusoidal cycle for any particular frequency. In addition to setting the value for this constant, the investigator must also specify a factor, m, that, when multiplied by the product of σt and the center frequency, f0, defines the number of cycles to be included in the mother wavelet (number of cycles=mσtf0). Often 6 cycles are recommended,47 but fewer cycles such as 413,61 or 238 have been used to the benefit of temporal resolution but at the expense of frequency resolution. Thus, the temporal window of a Morlet wavelet for any given frequency is mσt, and the spectral bandwidth around any given center frequency is mσf.
Morlet wavelets are usually normalized to have a total energy equal to 1 for each frequency prior to convolution with the EEG data, allowing direct comparisons between the magnitude values output for different frequencies.38,47 In order for the Morlet-derived magnitude values to be directly related to the raw voltage values for each frequency, a different normalization factor is required.47
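As a sketch of this construction and normalization (our own minimal Python/NumPy illustration, not the FieldTrip code we used), a complex Morlet wavelet can be built as a complex sinusoid under a Gaussian envelope and rescaled so that its total energy equals 1:

```python
import numpy as np

def morlet_wavelet(f0, c, m, fs):
    """Unit-energy complex Morlet wavelet at center frequency f0 (Hz),
    sampled at fs (Hz), spanning a total temporal window of m*sigma_t."""
    sigma_f = f0 / c
    sigma_t = 1.0 / (2 * np.pi * sigma_f)
    half = m * sigma_t / 2.0
    t = np.arange(-half, half, 1.0 / fs)          # time axis centered on the wavelet peak
    envelope = np.exp(-t**2 / (2 * sigma_t**2))   # Gaussian taper
    carrier = np.exp(2j * np.pi * f0 * t)         # complex sinusoid (real + imaginary parts)
    w = envelope * carrier
    w /= np.sqrt(np.sum(np.abs(w) ** 2))          # normalize total energy to 1
    return t, w

t, w = morlet_wavelet(40.0, 7, 4, 1000.0)         # 40-Hz wavelet at a 1-kHz sampling rate
```

Convolving this wavelet with a single-channel EEG time series yields, at each time point, a complex number whose magnitude and angle give the frequency-specific amplitude and phase estimates discussed above.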
We applied a Morlet wavelet analysis to the standard tone EEG epochs to examine the evoked gamma (~40 Hz) response to the tone in the 50–100 millisecond post-tone onset time range. In order to focus on the gamma band and its neighboring frequencies, we limited the frequency range examined to 20–60 Hz. We also limited the time period of interest to a range beginning at −150 milliseconds prestimulus and ending at 200 milliseconds poststimulus. The length of the EEG epochs (−500 to 600 milliseconds) encompassed time points beyond the period of interest in order to provide the wavelet at the lowest frequency examined (20 Hz) sufficient time samples at the edges of the period of interest (−150 and 200 milliseconds) to calculate the complex data for the wavelet's center point via convolution. Because our focus was on the evoked (ie, phase synchronized) gamma response at 50 milliseconds poststimulus, we extracted the phase angles from the complex data to estimate the cross-trial phase consistency, ie, the phase-locking factor (PLF), for all frequencies and time points in our time-frequency matrix. Our purpose in this set of analyses was to demonstrate the impact of different Morlet wavelet parameter choices on the resulting PLF values describing the same raw data. We varied 2 parameters, c and m, repeating the Morlet transform for 6 different c and m combinations. The constant, c, was set to values of 7 or 14. Based on the formulas provided above, doubling c doubles the SD of the Gaussian time envelope, σt, that shapes the wavelet. The multiplication factor, m, which determines the number of these σt SDs encompassed by the wavelet, was set to values of 2, 4, or 6. Note that the number of cycles contained within the mother wavelet increases as either of these parameters increases.
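The PLF computation itself is simple once the complex wavelet coefficients are in hand. In the following minimal sketch (our own Python/NumPy code; the array layout is an assumption for illustration), each coefficient is reduced to a unit-length phase vector, and the PLF at each time-frequency point is the magnitude of the mean of those vectors across trials, ranging from 0 (random phase) to 1 (perfect phase locking):

```python
import numpy as np

def phase_locking_factor(coeffs):
    """PLF from complex wavelet coefficients, shape (n_trials, n_freqs, n_times)."""
    unit_vectors = coeffs / np.abs(coeffs)    # keep phase, discard magnitude
    return np.abs(unit_vectors.mean(axis=0))  # magnitude of the across-trial mean

rng = np.random.default_rng(0)
# Trials sharing one phase are perfectly locked; uniform random phases are not.
locked = np.exp(1j * np.full((50, 1, 1), 0.3))
scattered = np.exp(1j * rng.uniform(0.0, 2 * np.pi, size=(500, 1, 1)))
plf_locked = phase_locking_factor(locked)[0, 0]        # ~1.0
plf_scattered = phase_locking_factor(scattered)[0, 0]  # small, approaching 0 as trials grow
```

Because the magnitudes are discarded before averaging, the PLF is insensitive to trial-to-trial amplitude differences and indexes only the cross-trial consistency of phase.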
The PLF plots resulting from the wavelet analyses using the 6 c and m parameter combinations are presented in figure 4, along with the associated Morlet wavelet used at the 40-Hz frequency bin. Although the plots depict the group mean PLFs (averaged over 22 subjects), we also present for purposes of illustration the overlays of 210 EEG trial epochs from one subject in the center of the figure. As can be seen in the figure, at either value of the constant, c, as m increases from 2 to 4 to 6, the wavelet expands in its temporal width, encompassing additional cycles within the tapering tails of the wavelet's Gaussian temporal envelope. Of note, when c = 7, the value of m closely corresponds to the number of cycles, such that the wavelet contains 2.23 cycles when m = 2 and 6.68 cycles when m = 6. When c is doubled to 14, the number of cycles contained in the wavelet at each level of m is also doubled, yielding 4.46 cycles when m = 2 and 13.37 cycles when m = 6. The PLF plots show that at the smallest temporal width of the 40-Hz wavelet, the PLF values in the gamma band are blurred across a broad frequency range, from 20 to nearly 60 Hz, but with relatively tight temporal specificity showing essentially 2 temporal bands, one centered at about 50 milliseconds and one at about 75 milliseconds (figure 4, top left PLF plot). This temporal specificity blurs into a single burst of gamma phase locking spanning 40–80 milliseconds, but with a much narrower gamma range centered on about 42 Hz and spreading between 36 and 48 Hz (top right PLF plot). For any given value of the multiplication factor, m, the increase in the constant, c, from 7 to 14 is associated with a wavelet with a stretched Gaussian envelope that encompasses twice the number of cycles and a more slowly declining taper toward the wavelet's edges.
As a result, doubling the constant broadens the temporal smearing while also tightening the gamma frequency range showing enhanced PLF, as is most evident in the bottom right PLF plot. These figures are consistent with the estimates of the temporal windows (mσt) and spectral bandwidths (mσf) associated with each of the 6 parameter combinations (see table 1).
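Estimates of this kind follow directly from the formulas given earlier. The following sketch (our own Python code; the values are computed from the formulas, not copied from table 1) enumerates the cycle counts, temporal windows, and spectral bandwidths for the 6 c and m combinations at the 40-Hz bin:

```python
import math

def wavelet_windows(f0, c_values=(7, 14), m_values=(2, 4, 6)):
    """Cycle count, temporal window (m*sigma_t), and spectral bandwidth
    (m*sigma_f) for each combination of c and m at center frequency f0 (Hz)."""
    rows = []
    for c in c_values:
        for m in m_values:
            sigma_f = f0 / c
            sigma_t = 1.0 / (2 * math.pi * sigma_f)
            rows.append({
                "c": c, "m": m,
                "cycles": m * sigma_t * f0,         # = m*c/(2*pi)
                "window_ms": m * sigma_t * 1000.0,  # temporal window, m*sigma_t
                "bandwidth_hz": m * sigma_f,        # spectral bandwidth, m*sigma_f
            })
    return rows

rows = wavelet_windows(40.0)
for r in rows:
    print(f"c={r['c']:2d} m={r['m']} cycles={r['cycles']:5.2f} "
          f"window={r['window_ms']:6.1f} ms bandwidth={r['bandwidth_hz']:5.1f} Hz")
```

Reading across the rows makes the trade-off explicit: within a value of c, increasing m widens the temporal window and the spectral bandwidth together, whereas doubling c at a fixed m widens the window while narrowing the bandwidth.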
Which of these time-frequency decompositions best captures the “true” nature of the gamma synchronization evoked by auditory tones? The answer, of course, is none of them. Each is as “true” as the others but reflects investigator judgments about the best compromise to strike between time resolution and frequency resolution. The fact that gamma PLF cannot be simultaneously pinpointed in time and in frequency, but instead is more precise in one dimension at the expense of precision in the other, has been related to the well-known Heisenberg uncertainty principle derived from quantum physics.25
For the group comparison, we used the PLF values based on a wavelet constant of c = 7 and a multiplication factor of m = 4. The time-frequency plots of the PLF values for the HC and SZ groups are presented in figure 5. The evoked gamma response was most evident in the PLF values of the HC group between 20 and 60 milliseconds in a frequency band of 35–50 Hz, as shown in figure 5. Accordingly, the PLF values in this time-frequency window were averaged in each group and analyzed in a group (HC vs SZ) × frontal-central (F3, Fz, F4 vs C3, Cz, C4) × laterality (F3, C3 vs Fz, Cz vs F4, C4) repeated-measures analysis of variance. The results showed a significant overall reduction of gamma PLF in SZ relative to HC (F1,41 = 5.01, P = .031), as well as significantly larger gamma PLF at frontal relative to central electrodes (F1,41 = 29.91, P < .0001) and at midline relative to off-midline electrodes (F1,41 = 19.93, P < .0001). There was also a trend for SZ to exhibit greater PLF reductions than HC at midline relative to lateral electrodes (F1,41 = 3.07, P = .057, Greenhouse-Geisser adjusted). No other interaction effects were significant.
These results suggest that chronic medicated patients with schizophrenia exhibit deficient phase synchronization of the frontally distributed gamma oscillation evoked by an auditory standard tone in an oddball target detection task. While it is not clear why we observed this difference when others examining the same auditory evoked gamma response did not,34,71,76 we speculate that task differences may account for the discrepant results. In particular, our standard tones were embedded in a 3-stimulus oddball paradigm involving both infrequent task-relevant target tones and infrequent task-irrelevant novel sounds. The presence of novel distractors may have heightened our task's attentional demands, relative to simpler 2-tone oddball tasks, unmasking the schizophrenia deficit in the evoked gamma response. Of note, the auditory gamma response evoked around 50 milliseconds following auditory stimuli has been shown to be modulated by top-down attentional control processes,35,36 consistent with the idea that the reduced gamma phase locking in the SZ group in our study may have arisen from task-related deficits in attentional state, relative to the HC group. Further work is needed to clarify under what conditions this early-evoked gamma response is reduced or intact in patients with schizophrenia.
There is growing recognition, from both basic and systems neuroscience, that the brain organizes and coordinates the information it processes through synchronized oscillatory activity among and between neuronal assemblies. This recognition has breathed new life into the relatively old technology of EEG recorded from the scalp. Furthermore, the development of mathematical algorithms and ready access to computational hardware and software that easily implement these algorithms have set the stage for a new era of EEG-based data analyses that are poised to elucidate the role of frequency-specific neuronal oscillations and their synchronization in brain functions ranging from simple sensory processing to higher order cognition. A natural extension of these methods to neuropathological conditions provides new leverage for understanding the pathophysiology of complex neuropsychiatric disorders such as schizophrenia. This is a timely development in that other aspects of the clinical neuroscience of schizophrenia increasingly point to disruptions in connectivity and coordination among brain regions, processes that depend on synchronized neuronal oscillations. Furthermore, schizophrenia is associated with compromise of neuronal elements that subserve these oscillations, such as abnormalities in parvalbumin-expressing γ-aminobutyric acidergic interneurons2 and in N-methyl-D-aspartate glutamate receptors.82 As a result, there is a growing literature using EEG (and magnetoencephalography [MEG]) to study abnormal brain dynamics, synchronization, and connectivity in schizophrenia. Accordingly, in order for the schizophrenia neuroscience community to be able to synthesize the results from these studies, a wider segment of this community will need to develop a basic understanding of the methods being used for spectral decomposition of EEG, the dependence of results on the parameter settings chosen, and the variation across studies in how the concept of neural synchrony is addressed.
Toward this end, we have provided a basic overview of spectral decomposition methods and neural power and synchrony measures, most of which have already been implemented in recent event-related EEG studies of schizophrenia. These methods and measures, and the names assigned to them, can be a source of confusion in the research literature. All the measures make use of the magnitude and/or phase angle information derived from the complex data extracted from the EEG during spectral decomposition. Some measures estimate the magnitude or phase consistency of the EEG within one channel across trials, whereas others (sometimes with similar names) estimate the consistency of the magnitude or phase differences between channels across trials. Beyond these 2 families of calculations, there are also measures that examine the coupling between frequencies, within trials and within recording sites. Of course, in the realm of time-frequency analysis, many types of relationships can be examined beyond those already mentioned, and new measures are still being created and explored.
This work was supported by the Department of Veterans Affairs and grants from the National Institute of Mental Health (MH058262) and the National Alliance for Research in Schizophrenia and Affective Disorders.
The authors thank the two reviewers for their helpful comments.