New efforts to develop treatments for cognitive dysfunction in mental illnesses would benefit enormously from biomarkers that provide sensitive and reliable measures of the neural events underlying cognition. Here we evaluate the promise of event-related potentials (ERPs) as biomarkers of cognitive dysfunction in schizophrenia. We conclude that ERPs have several desirable properties: (a) they provide a direct measure of electrical activity during neurotransmission; (b) their high temporal resolution makes it possible to measure neural synchrony and oscillations; (c) they are relatively inexpensive and convenient to record; (d) animal models are readily available for several ERP components; (e) decades of research have established the sensitivity and reliability of ERP measures in psychiatric illnesses; and (f) the feasibility of large-N (>500) multi-site studies has been demonstrated for key measures. Consequently, ERPs may be useful for identifying endophenotypes and defining treatment targets, for evaluating new compounds in animals and in humans, and for identifying individuals who are good candidates for early interventions or for specific treatments. However, several challenges must be overcome before ERPs gain widespread use as biomarkers in schizophrenia research, and we make several recommendations for the research that is necessary to develop and validate ERP-based biomarkers that can have a real impact on treatment development.
In October, 2009, the CNTRICS initiative (Cognitive Neuroscience Treatment Research to Improve Cognition in Schizophrenia) convened a meeting to discuss the development of biomarkers of cognitive dysfunction in schizophrenia. This paper reflects the consensus of the subgroup of participants who focused on event-related potentials (ERPs) and related signals. The paper evaluates the strengths and weaknesses of ERPs as biomarkers so that researchers and sponsors can make informed decisions about future research directions. It also provides a roadmap of the methodological advances and validation studies that are necessary for ERPs to serve as broadly useful biomarkers.
ERPs are voltage fluctuations in the EEG that are time-locked to internal or external events (e.g., stimuli, responses, decisions). ERPs are usually isolated from the ongoing EEG by a signal-averaging procedure, revealing a sequence of positive and negative peaks that are related to the event of interest. It is also possible to use frequency-domain techniques to examine event-related oscillatory activity that is invisible in conventional ERPs. Although we will focus mainly on conventional ERPs, oscillations are generated by similar mechanisms and are subject to similar methodological considerations, so most of our discussion applies to event-related oscillations as well.
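The signal-averaging procedure can be illustrated with a minimal simulation. The sketch below is not taken from any of the studies cited here; the waveform shape, noise level, trial count, and all function names are hypothetical choices made for illustration. It assumes that the event-related signal is identical on every trial and that the background EEG behaves as independent noise.

```python
import math
import random

def simulate_epochs(n_trials, n_samples, noise_sd, rng):
    """Return (true_erp, epochs): one fixed ERP-like bump plus independent noise per trial."""
    # Hypothetical ERP: a Gaussian-shaped positive peak (5 microvolts) centred mid-epoch.
    center, width = n_samples / 2, n_samples / 10
    true_erp = [5.0 * math.exp(-((t - center) ** 2) / (2 * width ** 2))
                for t in range(n_samples)]
    epochs = [[s + rng.gauss(0.0, noise_sd) for s in true_erp]
              for _ in range(n_trials)]
    return true_erp, epochs

def average_epochs(epochs):
    """Point-by-point mean across time-locked trials (the signal-averaging step)."""
    n = len(epochs)
    return [sum(samples) / n for samples in zip(*epochs)]

def rms_error(a, b):
    """Root-mean-square difference between two waveforms of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

rng = random.Random(0)  # fixed seed so the simulation is repeatable
true_erp, epochs = simulate_epochs(n_trials=200, n_samples=100, noise_sd=20.0, rng=rng)
erp = average_epochs(epochs)

single_trial_error = rms_error(epochs[0], true_erp)
averaged_error = rms_error(erp, true_erp)
print(f"single-trial RMS error: {single_trial_error:.2f}")
print(f"200-trial average RMS error: {averaged_error:.2f}")
```

In this simulation the averaged waveform lies much closer to the underlying signal than any single trial does, which is why activity that is not time-locked to the event tends to cancel in the average.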
In almost all cases, ERPs originate from postsynaptic potentials (PSPs) in cortical pyramidal cells, arising as a consequence of the flow of ions across the cell membrane in response to neurotransmitters binding with receptors. When PSPs occur simultaneously in similarly oriented neurons, the resulting field potentials summate, and the voltage can be detected instantaneously on the scalp. Thus, ERPs provide a direct, millisecond-resolution measure of neurotransmission-related neural activity.
ERPs can be measured at the scalp only when PSPs are simultaneously produced in many thousands of similarly oriented neurons. Consequently, PSPs occurring in interneurons and in nonlaminar structures cannot typically be measured in scalp ERP recordings (although their synaptic effects on cortical pyramidal cells can be detected). Moreover, conventional ERP recordings are primarily sensitive to transient changes in neural activity over periods of tens or hundreds of milliseconds and are relatively insensitive to tonic changes that last for several seconds or more. Thus, although many neural processes can be measured with ERPs, not all types of brain activity produce an ERP signature on the scalp.
In basic science studies (and some clinical studies), it can be challenging to isolate specific ERP components, which are summed together in the scalp recordings. The ERP generated in a single brain area will spread broadly as it travels to the scalp, contributing to the measured voltage over the entire scalp. The contribution of a given component to a specific electrode site depends on the location and orientation of the generator source with respect to the electrode site (along with the conductivities and geometries of the different tissue compartments in the head), so different components are picked up more strongly at some sites than at others. However, when multiple components are active simultaneously, they all contribute (to some extent) to the voltage recorded at every electrode site.
To address this challenge, several procedures have been developed to estimate the internal distribution of neural activity on the basis of the observed scalp potential distribution (1-4). Because these procedures require the addition of constraining assumptions to produce a unique solution, converging evidence is usually necessary to unambiguously localize an ERP generator. However, once this has been accomplished in basic science research, the results can be used in subsequent clinical research. In addition, more straightforward techniques such as current source density transformations can be used to de-blur scalp ERP signals (5). Thus, the relatively poor spatial resolution of the ERP technique is not a major impediment to the use of ERPs as biomarkers in clinical research.
The skull and scalp are transparent to magnetism, and the blurring of the voltage fields by these tissues can be avoided almost completely by using magnetoencephalography (MEG) to record the magnetic counterpart to the ERP signal (6). Under many conditions, MEG localization is more precise than ERP localization, and localization can be even more precise when MEG and EEG are combined (7-9).
To isolate specific ERP components, it is often useful to construct difference waves, in which the waveform in one condition is subtracted from the waveform in another condition. If the two conditions vary primarily in the presence or strength of a single component, then the difference wave will isolate this component (10). Multivariate statistical approaches can also help dissociate ERP components (11, 12).
Figure 1 shows how difference waves have been used to isolate the mismatch negativity (MMN). If a sequence of tones is presented with 80% of the tones being 1000-Hz standards and 20% of the tones being deviants of a different pitch, the deviant-minus-standard difference wave will contain a negative component peaking between 100 and 200 ms. Several studies have provided evidence that the neural activity isolated by means of this difference wave can be localized to the auditory cortex (13, 14) and that it may, in part, reflect the flow of current through NMDA receptor-mediated ion channels (15-20) (but see 21).
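The deviant-minus-standard subtraction can be sketched in a few lines. The example below uses synthetic averaged waveforms rather than real data; the component latencies, amplitudes, sampling interval, and helper names are assumptions chosen only to show the logic of the subtraction and of measuring the peak within the 100-200 ms window.

```python
import math

def difference_wave(deviant, standard):
    """Deviant-minus-standard subtraction, sample by sample."""
    if len(deviant) != len(standard):
        raise ValueError("waveforms must have the same number of samples")
    return [d - s for d, s in zip(deviant, standard)]

def most_negative_peak(wave, times_ms, window_ms):
    """Return (latency_ms, amplitude) of the most negative sample inside the window."""
    lo, hi = window_ms
    candidates = [(v, t) for v, t in zip(wave, times_ms) if lo <= t <= hi]
    amplitude, latency = min(candidates)
    return latency, amplitude

def bump(t, peak_ms, width_ms, amp):
    """Gaussian-shaped deflection used to build the synthetic waveforms."""
    return amp * math.exp(-((t - peak_ms) ** 2) / (2 * width_ms ** 2))

# Synthetic averaged waveforms sampled every 2 ms from -100 to 398 ms.
times_ms = list(range(-100, 400, 2))

# Hypothetical values: both conditions share a negativity at 100 ms;
# only the deviants carry an additional negativity at 150 ms.
standard = [bump(t, 100, 20, -2.0) for t in times_ms]
deviant = [bump(t, 100, 20, -2.0) + bump(t, 150, 25, -3.0) for t in times_ms]

mmn = difference_wave(deviant, standard)
latency, amplitude = most_negative_peak(mmn, times_ms, window_ms=(100, 200))
print(f"peak latency: {latency} ms, peak amplitude: {amplitude:.2f} microvolts")
```

Because the shared deflection is present in both conditions, it cancels in the subtraction and only the deviant-specific activity remains. With real data the cancellation is never this clean, which is why the two conditions should differ primarily in a single component.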
It is useful to compare ERPs with the blood oxygen level-dependent (BOLD) signal in functional magnetic resonance imaging (fMRI). Whereas ERPs directly reflect PSPs and may be tied to specific transmitter systems, the BOLD signal measures the hemodynamic response and is correlated with both local field potentials and spikes (22, 23), which may make it difficult to link the BOLD signal to a specific transmitter system. The BOLD signal reflects the integrated neural response over periods of seconds, and it will therefore tend to be more sensitive to long-duration changes in neural activity than to the brief, transient changes in neural activity that are primarily reflected by ERPs (24). In addition, because ERPs are direct measures of neural activity, any effects of a treatment on the ERP must reflect changes in neural activity, whereas a given treatment could influence the magnitude or timing of the BOLD response without changing the underlying neural activity (via changes in the vasculature or in the control of the hemodynamic response). However, fMRI provides a more accurate spatial image of the neural activity, providing a closer link to anatomically defined neural circuits. Moreover, ERPs cannot be observed unless a given neural process leads to synchronized PSPs in cortical pyramidal neurons, whereas a broader set of neural processes will lead to changes in the BOLD signal. Thus, ERPs are more tightly linked to specific aspects of neural activity, but the BOLD signal may be useful in measuring a broader set of neural responses and in determining the anatomical site of those responses. With respect to preclinical studies, a major advantage for ERP measures is the availability of animal models of ERP paradigms that demonstrate high sensitivity to pharmacological manipulations, genetic modifications, and lesions relevant to schizophrenia (20, 25, 26). We are unaware of any large-scale cross-species validation studies of fMRI paradigms utilized in schizophrenia.
ERPs, unlike hemodynamic measures, have a temporal resolution in the millisecond range. This is particularly advantageous in the measurement of neural synchrony and oscillations. Schizophrenia, for example, may be associated with a pervasive disturbance of synchronized activity among neural ensembles (27-29). Synchrony and oscillatory activity are theorized to be critical for a wide range of perceptual, motor and cognitive processes (30). Disturbances in synchrony may therefore contribute to the disorganized thinking and behavior associated with this disorder. Oscillations can be measured non-invasively in patients, through intracranial recordings in rodent models, and in cellular recordings in slice preparations (28, 31, 32). Consequently, translation from cellular mechanisms to human electrophysiology is highly feasible.
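One common way to quantify the consistency of oscillatory phase across trials is inter-trial phase coherence (also called the phase-locking factor). The sketch below is offered only as an illustration of that general idea and is not the method of any study cited here. The 40-Hz frequency, sampling rate, noise level, and function names are all hypothetical, and the phase is estimated with a simple single-frequency Fourier projection rather than the wavelet or filter-based methods typically used in practice.

```python
import cmath
import math
import random

def phase_at_frequency(signal, freq_hz, srate_hz):
    """Phase of one trial at freq_hz, from a single-frequency Fourier projection."""
    coeff = sum(x * cmath.exp(-2j * math.pi * freq_hz * n / srate_hz)
                for n, x in enumerate(signal))
    return cmath.phase(coeff)

def intertrial_phase_coherence(trials, freq_hz, srate_hz):
    """Length of the mean unit phase vector: 1 = identical phase on every trial, near 0 = random phase."""
    vectors = [cmath.exp(1j * phase_at_frequency(trial, freq_hz, srate_hz))
               for trial in trials]
    return abs(sum(vectors) / len(vectors))

def make_trials(n_trials, phase_jitter, rng, freq_hz=40.0, srate_hz=500.0, n_samples=250):
    """Simulated sinusoids plus noise; phase_jitter is the width (radians) of the random phase offsets."""
    trials = []
    for _ in range(n_trials):
        offset = rng.uniform(-phase_jitter / 2, phase_jitter / 2)
        trials.append([math.sin(2 * math.pi * freq_hz * n / srate_hz + offset)
                       + rng.gauss(0.0, 0.5)
                       for n in range(n_samples)])
    return trials

rng = random.Random(1)  # fixed seed so the simulation is repeatable
locked = make_trials(100, phase_jitter=0.2, rng=rng)            # nearly constant phase
unlocked = make_trials(100, phase_jitter=2 * math.pi, rng=rng)  # phase random across trials

itc_locked = intertrial_phase_coherence(locked, 40.0, 500.0)
itc_unlocked = intertrial_phase_coherence(unlocked, 40.0, 500.0)
print(f"phase-locked trials: {itc_locked:.2f}")
print(f"random-phase trials: {itc_unlocked:.2f}")
```

Both simulated conditions contain oscillations of the same amplitude, so the measure distinguishes them only by the consistency of their timing. This is the kind of information that requires millisecond-level temporal resolution.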
ERPs may be useful as biomarkers in various aspects of psychiatric treatment research. First, ERPs may serve as endophenotype markers that are closer to the underlying biology of the disease than are clinical symptoms, and they are relatively convenient to use in large-N genetic studies. The feasibility of utilizing ERPs in multi-site, large N studies (> 500) has been established by past (33) and ongoing functional genomic projects (34).
Second, ERPs can be used in preclinical research to define potential treatment targets. That is, if a given ERP component measures the operation of a given cognitive process, neural circuit, or transmitter-receptor system, then an abnormality of this component suggests that treatments targeting that process, circuit, or system might produce therapeutic benefits.
Third, homologues of human ERPs can be found in both rodents and primates (25, 35), providing opportunities for across-species translational research. If, for example, an ERP abnormality can be induced in a rodent homologue that matches the abnormality observed in patients, then the rodent model may be used as an assay of the likely effectiveness of novel compounds.
Fourth, ERPs can be used in human clinical research to determine whether a given treatment influences the specific process, circuit, or transmitter system of interest. ERPs can also be used in healthy volunteers in whom the ERP abnormality has been transiently induced by a pharmacological challenge (36).
Fifth, ERPs might be useful in defining subgroups within a disorder. Imagine, for example, that only 20% of patients with a given disorder exhibit a specific ERP impairment. A new treatment that normalizes this ERP component might be effective primarily in this subgroup of patients. Clinical trials that include only these patients would therefore be much more likely to detect an effect of this treatment, dramatically increasing the cost effectiveness of the trials.
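The potential gain in trial efficiency can be made concrete with a back-of-the-envelope calculation. The sketch below rests on simplifying assumptions that are ours, not results from the literature: the treatment has a standardized effect of 0.5 in patients with the ERP impairment and no effect in other patients, outcome variance is the same in both groups, and sample size is approximated with the usual normal-approximation formula for a two-arm comparison of means. The function names are hypothetical.

```python
from statistics import NormalDist

def n_per_arm(effect_size_d, alpha=0.05, power=0.80):
    """Approximate per-arm sample size for a two-sided, two-sample comparison of means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return 2 * (z_alpha + z_beta) ** 2 / effect_size_d ** 2

def diluted_effect(effect_in_responders, responder_fraction):
    """Average standardized effect when patients outside the subgroup show no effect."""
    return effect_in_responders * responder_fraction

d_subgroup = 0.5   # hypothetical effect size in patients with the ERP impairment
fraction = 0.20    # the 20% figure used in the example above

n_enriched = n_per_arm(d_subgroup)
n_unselected = n_per_arm(diluted_effect(d_subgroup, fraction))
ratio = n_unselected / n_enriched

print(f"per-arm n, ERP-selected patients only: {n_enriched:.0f}")
print(f"per-arm n, unselected patients: {n_unselected:.0f}")
print(f"ratio: {ratio:.1f}")
```

Because the required sample size scales with the inverse square of the effect size, diluting the effect fivefold multiplies the required sample roughly twenty-five-fold under these assumptions. The calculation ignores the cost of ERP screening and the number of patients who must be screened to fill the enriched trial.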
Note that ERPs are already widely used in the diagnosis of certain neurological disorders (e.g., multiple sclerosis) and sensory disorders (e.g., screening of neonates for hearing impairments). Thus, it is realistic to expect that ERPs could be used as biomarkers in the identification of genes related to mental illness, in efforts to identify individuals at high risk for developing disorders prior to illness onset, and in the development and assessment of new treatments. P300 abnormalities, for example, have been demonstrated in individuals at high risk for transition to schizophrenia or other psychotic disorders (37). The next section will consider in detail the steps that must be taken for ERPs to become truly useful biomarkers.
ERPs that correlate with symptom status across time could be used as markers of disease state and to assess efficacy of new treatments. ERPs that are stable over time—independent of changes in symptoms—could be used as markers of stable traits that are associated with responsiveness to specific treatments or with genetic liability.
Current evidence about correlations with symptom status suggests that some components are more trait-like and others are more state-like. For example, although P50 gating (a suppression of the P50 component by a preceding conditioning stimulus) is reliably reduced in schizophrenia patients (38), there is not yet clear and consistent evidence that P50 gating varies with symptoms (39). In contrast, P300 amplitude has been shown to correlate with both schizophrenia subtype and a variety of symptoms (40). It should be noted, however, that correlations with gross clinical measures of positive or negative symptoms may not be as important as correlations with specific cognitive, affective, and social functions that are the targets of new treatment development efforts.
A biomarker will be particularly useful in the drug discovery process if it can predict which compounds are likely to be effective. For example, given that the MMN is reduced in schizophrenia patients, do medications that normalize the MMN also lead to a reduction in functional impairment? This would be particularly useful when combined with an animal model, because novel compounds could be screened in animals to determine whether they are likely to be effective in humans. Although ERPs have the potential to provide such information, we currently have little information about whether any ERP effects in animals can actually predict the efficacy of new treatments in humans. This is a key area for future research.
Biomarkers can also be useful if they predict individual differences in treatment response. If, for example, only a subset of schizophrenia patients show a reduction in a given component, certain treatments might be beneficial primarily for that subset of patients. Given the tremendous heterogeneity that characterizes schizophrenia, this kind of predictive ability could be enormously valuable. ERP measures have been used only infrequently as biomarkers for treatment outcomes or predictive measures for treatment response, but initial studies have been encouraging. With respect to schizophrenia, large MMN amplitudes predicted good treatment response to clozapine, while large P3a amplitudes predicted poor treatment response (41). In mood disorders, SSRI antidepressants have been shown to be less effective in patients with an enlarged error-related negativity and more effective in patients showing reduced intensity dependence of the auditory N1-P2 complex (reviewed in 42). Studies of other neuropsychiatric disorders suggest that ERP measures predict transition from coma to more responsive levels of consciousness (43), resumption of drinking in sober alcoholics (44), response to atomoxetine for ADHD treatment (45), and response to modafinil treatment for narcolepsy (46).
There is increasing interest in identifying individuals at risk for psychiatric illness prior to the onset of the illness so that early interventions can be attempted. ERPs could potentially serve as biomarkers of risk for psychosis that would prove useful in identifying the prodromal individuals who stand to benefit most from early treatment. For example, a number of reports have shown that individuals with prodromal symptoms indicative of high risk for schizophrenia already have reduced P300 amplitude (37, 47-49) and MMN amplitude (50, 51) before the onset of schizophrenia. These studies, as well as a large-scale eight-site prospective longitudinal study (52), are recording ERPs to determine whether ERP abnormalities can help predict which prodromal individuals will actually convert to schizophrenia.
Good measurement properties (e.g., reliability, validity, sensitivity, selectivity) are important for ERPs to be used in clinical trials and in clinical diagnosis/screening. ERP reliability is determined primarily by two classes of factors. The first is the signal-to-noise ratio of the ERP waveform, which is influenced by several factors including the number of trials averaged together, the amount of induced electrical noise from the recording environment, and the incidence of non-neural bioelectric signals such as skin potentials (53). The second class comprises factors that cause uncontrolled variations in the actual neural activity, such as arousal level, time since last meal, and phase of the circadian cycle (54). Quite a lot is known about how to optimize these factors, but we lack formal, widely adopted standards for measuring and evaluating the signal-to-noise ratio.
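The dependence of the signal-to-noise ratio on the number of trials follows a simple rule under idealized conditions. If the signal is identical on every trial and the noise is independent and stationary across trials, the noise in the average shrinks with the square root of the number of trials. The sketch below expresses that rule; the numerical values and function names are hypothetical, and real EEG noise only approximately satisfies these assumptions.

```python
import math

def snr_after_averaging(single_trial_snr, n_trials):
    """SNR of the average, assuming constant signal and independent, stationary noise."""
    return single_trial_snr * math.sqrt(n_trials)

def trials_needed(single_trial_snr, target_snr):
    """Smallest trial count whose average reaches target_snr under the same assumptions."""
    return math.ceil((target_snr / single_trial_snr) ** 2)

# Hypothetical example: a 5-microvolt component in 20 microvolts of noise.
single_trial_snr = 5.0 / 20.0

for target in (5.0, 10.0):
    n = trials_needed(single_trial_snr, target)
    print(f"target SNR {target:>4}: {n} trials "
          f"(achieved SNR {snr_after_averaging(single_trial_snr, n):.1f})")
```

The practical consequence is that doubling the signal-to-noise ratio requires four times as many trials, so reducing noise at the source is often more efficient than lengthening the recording session.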
The reliability of ERP measures varies across paradigms. For example, a study of the oddball paradigm with testing episodes separated by 4 to 1061 days found excellent test-retest reliability of the P300 component in both patients (correlations of .82 to .87) and control subjects (.86 to .93) (55, see also 56, 57). In contrast, most test-retest studies of P50 gating have reported low correlations (well under 0.5) (reviewed by 58). Also, the reliability of ERP measures will depend on how well a given laboratory optimizes the signal-to-noise ratio (59) and how well factors such as time of day are controlled. A recent review of fMRI reliability similarly found that fMRI reliability varies according to factors such as the task, subject population, and recording methods, with a range that is comparable to that observed for ERPs (60).
The selectivity of a given ERP effect for a specific diagnostic group can also be important. However, it is important to keep in mind that a given cognitive-neural-pharmacological system may be impaired in only a subset of individuals within a given diagnostic group, and it may be impaired across multiple diagnostic groups. Thus, a biomarker that is highly sensitive to and selective for a given underlying system may not selectively distinguish among our current diagnostic categories.
ERPs are widely used to document differences between groups of individuals (e.g., between patients and healthy controls). To be broadly useful as biomarkers, however, ERPs must be able to characterize individuals within a group, and this has been done only rarely. As illustrated in Figure 2, ERP waveforms often differ markedly across individuals. This sort of variability can be either good or bad, depending on the source of variability. The variability is good if it reflects true variation in the operation of a given cognitive, neural, or pharmacological system. The variability is bad if it reflects measurement error (i.e., noisy data) or differences among individuals in factors that are unrelated to the system of interest.
Individual differences in ERP amplitudes may partly result from differences in nuisance variables such as cortical folding and skull thickness. As discussed earlier, the voltage measured at a given electrode site will depend on the position and orientation of the generator, along with the geometry and conductivities of the different tissue compartments (e.g., brain, skull, scalp). The orientation of an ERP generator might vary considerably across individuals as a result of differences in cortical folding patterns, leading in some cases to partial cancellation when the generator extends to both sides of a sulcus (61). In addition, differences due to skull thickness and resistance can be considerable (62). For example, P300 amplitude declines by approximately 1 μV for each additional mm of skull thickness (63). However, relatively little is known about whether these nuisance factors explain a meaningful percentage of the variance in ERP amplitudes across individuals. If they are a major source of individual differences, it is possible that more sophisticated measures of ERP amplitudes could be used to minimize their impact. For example, it may be possible to use source localization procedures to provide a better estimate of the strength of the underlying generator source. Additional research is necessary to develop ERP quantification techniques that are sensitive to individual differences in the underlying cognitive-neural-pharmacological processes but are insensitive to nuisance variables.
Animal homologues can be particularly useful in the early stages of drug discovery, where they can be used as a first step in screening new compounds. ERPs can be recorded readily in animals, but demonstrating true homologies is nontrivial, especially between humans and rodents. For example, the P3b subcomponent of the P300 complex in humans is observed only when subjects actively discriminate between stimulus categories (64), yet active discrimination paradigms have been used in some animal homologue studies (e.g., 65) but not in others (e.g., 66). Good homologues of several cognitive ERP components have been established in monkeys (25, 35). In addition, ERP components that are elicited automatically in the absence of a specific task have been shown to have homologues in rodents, including P50 gating (26) and the MMN (20).
For ERPs to be useful as a screening or outcome measure in clinical trials or for guiding individualized treatment in clinical practice, it is necessary to have rigorous and standardized quality assurance protocols. This is especially important for large-scale, multi-site clinical trials. Remarkably, there are no widely used formal standards for assessing the quality of ERP data. Professional societies periodically establish and revise lists of recording standards and publication criteria (67, 68). However, these standards are minimal and primarily focus on data acquisition parameters rather than an assessment of the actual quality of the data. For example, these standards specify desirable electrode materials, electrode impedances, analog-to-digital conversion precision, sampling rate, etc., but they do not provide clear, quantitative, and widely applicable standards for signal-to-noise ratios, biological artifact contamination levels, component quantification accuracy, or split-half reliability. As a result, it is difficult to determine whether the data collected at a given site have met minimal standards for data quality or whether differences in the results across sites or across studies reflect differences in data quality. This is a key area for future development.
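To illustrate what a quantitative data-quality metric might look like, the sketch below computes an odd/even split-half reliability with the Spearman-Brown correction from single-trial amplitude measurements. It is a conventional psychometric calculation shown on simulated data, not a proposed standard. The subject count, trial count, amplitude distribution, and function names are all hypothetical.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def split_half_reliability(single_trial_amps_by_subject):
    """Odd/even split-half correlation across subjects, Spearman-Brown corrected."""
    odd_means, even_means = [], []
    for amps in single_trial_amps_by_subject:
        odd, even = amps[0::2], amps[1::2]
        odd_means.append(sum(odd) / len(odd))
        even_means.append(sum(even) / len(even))
    r_half = pearson(odd_means, even_means)
    # Spearman-Brown: estimate the reliability of the full-length measurement.
    return 2 * r_half / (1 + r_half)

# Simulated single-trial amplitudes (microvolts): a stable subject-specific
# amplitude plus independent trial-to-trial noise.
rng = random.Random(2)  # fixed seed so the simulation is repeatable
subjects = []
for _ in range(60):
    true_amp = rng.gauss(8.0, 3.0)
    subjects.append([true_amp + rng.gauss(0.0, 10.0) for _ in range(100)])

reliability = split_half_reliability(subjects)
print(f"split-half reliability (Spearman-Brown corrected): {reliability:.2f}")
```

A metric of this kind could in principle be computed at each site of a multi-site study and compared against an agreed threshold, although choosing that threshold is exactly the sort of question that remains open.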
A potential biomarker is more likely to receive widespread use if it is inexpensive and easy to implement. Depending on factors such as the number of channels and the need for extensive shielding, an EEG recording system can be acquired for as little as $10,000 and as much as $250,000 (with a typical range of $30,000-$100,000). Significant training is required for the technician who records the EEG data, but the technician need not have even a bachelor's degree. Data processing and analysis require a higher level of education and training, but conventional ERP analyses are fairly straightforward and do not require an advanced degree. EEG recordings are routine in hospitals and clinics, and special facilities are not normally necessary (unless electrical noise is a problem). Thus, although the costs of ERPs are higher than the costs of simple neuropsychological testing, they are substantially lower than the costs of MEG and MRI.
ERPs are easily tolerated by most patients. Preparation time is usually relatively brief, unless the study requires both large numbers of electrodes and low electrode impedances. The duration of the actual recording depends on the nature of the component and paradigm being used, but is typically between 20 and 90 minutes. ERPs are already used in clinical practice to assess sensory responses, and it therefore seems reasonable to expect that they could be adapted for the assessment of cognitive processing. Thus, cost and implementation are not major impediments to the widespread adoption of ERPs as biomarkers.
ERPs have great potential as biomarkers that could be used to improve the development of treatments for schizophrenia and other psychiatric disorders. They can be readily adapted to clinical contexts, and they could be used in clinical trials and genetic studies that require very large sample sizes. At least some ERP paradigms lead to highly reliable measures. ERPs provide a direct measure of neural activity, and some ERP components have been linked with specific transmitter-receptor systems. ERPs can be recorded easily in other species, and good homologues of several human ERP components have been developed in species ranging from mice to monkeys.
However, there are several issues that must be addressed before ERPs can live up to their promise and become widely used as biomarkers. In our view, the most urgent issues are as follows:
We believe that these issues can be addressed within a reasonable time frame if sufficient effort is made, leading to valuable ERP biomarkers that could speed the development of new treatments.
Preparation of this article was made possible by NIH grants R01MH076226 (S.J.L.), R01MH065034 (S.J.L.), R01MH076989 (D.H.M.), R01MH082022 (D.H.M.), RO1MH62150 (B.F.O), P41-RR14075 (M.S.H.), R01EB009048 (M.S.H.), R01EB006385 (M.S.H.), R01MH080187 (K.M.S.), and R37MH49334 (D.C.J.), as well as by IUSM/CTR, NIH/NCRR Grant Number RR025761 (B.F.O.). Its contents are solely the responsibility of the authors and do not necessarily represent the official views of the NIH.
Financial Disclosures: Daniel H. Mathalon reports receiving research funding from NARSAD and AstraZeneca. Kevin M. Spencer reports receiving research funding from Galenea Inc. Daniel C. Javitt reports holding equity in Glytech, Inc., serving on the advisory board of Promentis, receiving research funding from Pfizer, Roche, and Jazz, and performing consulting for Schering-Plough, Pfizer, AstraZeneca, NPS Pharmaceuticals, Sanofi, Solvay, Takeda, Sepracor, Lundbeck, Cypress, and Merck. All other authors report no biomedical financial interests or potential conflicts of interest.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Steven J. Luck, University of California, Davis.
Daniel H. Mathalon, University of California, San Francisco.
Brian F. O'Donnell, Indiana University.
Matti S. Hämäläinen, Harvard Medical School.
Kevin M. Spencer, VA Boston Healthcare System and Harvard Medical School.
Daniel C. Javitt, Nathan Kline Institute for Psychiatric Research.
Peter J. Uhlhaas, Max Planck Institute for Brain Research.