Author contributions: D.I.L. and D.C.J. designed research; D.I.L. performed research; P.L. and D.C.J. analyzed data; P.L., C.E.S., D.I.L., and D.C.J. wrote the paper.
Recent neuroscience advances suggest that when interacting with our environment, along with previous experience, we use contextual cues and regularities to form predictions that guide our perceptions and actions. The goal of such active “predictive sensing” is to selectively enhance the processing and representation of behaviorally relevant information in an efficient manner. Since a hallmark of schizophrenia is impaired information selection, we tested whether this deficiency stems from dysfunctional predictive sensing by measuring the degree to which neuronal activity predicts relevant events. In healthy subjects, we established that these mechanisms are engaged in an effort-dependent manner and that, based on a correspondence between human scalp and intracranial nonhuman primate recordings, their main role is a predictive suppression of excitability in task-irrelevant regions. In contrast, schizophrenia patients displayed a reduced alignment of neuronal activity to attended stimuli, which correlated with their behavioral performance deficits and clinical symptoms. These results support the relevance of predictive sensing for normal and aberrant brain function, and highlight the importance of neuronal mechanisms that mold internal ongoing neuronal activity to model key features of the external environment.
Our brains do not have the capacity to continuously process all information that is constantly bombarding our sense organs. One of the tactics the brain has evolved to deal with this surfeit of information is to predictively amplify sensory inputs carrying relevant information while suppressing inputs that do not (Nobre et al., 2007; Lakatos et al., 2008; Schroeder et al., 2010; Friston, 2012). Such predictive sensing is an active and dynamic process that exploits regularities in both the timing and content of sensory events. In the case when temporal features, such as the rhythm of attended stimuli, are predictable, the brain can anchor the cyclic patterns of its internal excitability fluctuations (neuronal oscillations) to the timing of attended stimuli to amplify inputs related to these stimuli (Large and Jones, 1999; Lakatos et al., 2008; Saleh et al., 2010). Aligning the internal neurophysiological context to the external context of relevant stimuli allows the brain to optimize precious processing resources and increase not only processing accuracy, but also processing efficiency.
This internal-to-external alignment that is the means of active predictive sensing can be guided by the temporal structure of motor-initiated rhythmic events such as saccades in vision, sniffing in olfaction, and “whisking” in somatosensation, or simply by the temporal structure of attended stimuli, most characteristically in audition (Schroeder et al., 2010). Across brain systems, predictive sensing results in temporal windows of high excitability characterized by increased beta/gamma (>15 Hz) and multiunit activity (MUA) within local neuronal ensembles at times when attended stimuli are expected to occur (Lakatos et al., 2008; Saleh et al., 2010). Such “excitability windows” can also be two dimensional in that they are selective for both time and feature. For example, in a recent nonhuman primate study we demonstrated that when both the timing and pitch of auditory stimuli were predictable, high excitability was restricted both in time and across differently tuned neuronal ensembles in cortical space (Lakatos et al., 2013). Along with serving as a spectrotemporal filter mechanism of auditory attention, the confinement of high-excitability windows in time and space in the brain is an efficient strategy for minimizing the energy expenditure needed to maintain a near-threshold (high-excitability) cortical state (Buzsáki and Draguhn, 2004; Kann, 2011). The fact that in many neuropsychiatric disorders, such as schizophrenia, patients are not only less accurate in processing information, but also significantly less efficient (Potkin et al., 2009; Nicodemus et al., 2010), might thus indicate a failure of predictive-sensing mechanisms. The main goal of our study was to characterize auditory active predictive processes in an adaptive auditory discrimination task and the consequences of their failure in schizophrenia patients, a pathological population with well described auditory-processing deficits and cognitive disturbances.
Informed consent was obtained from 40 schizophrenia patients (30 male and 10 female) and 20 healthy control subjects (10 male and 10 female). Patients were diagnosed according to Diagnostic and Statistical Manual of Mental Disorders, fourth edition (DSM-IV), criteria using the Structured Clinical Interview for DSM-IV (SCID) and/or a combination of interview, chart review, and consultation with clinical staff. Control subjects were recruited via local advertisement and screened in accordance with SCID, Nonpatient Edition, criteria. All patients were receiving antipsychotic medication (e.g., haloperidol, haloperidol decanoate, olanzapine, risperidone, quetiapine, ziprasidone, and clozapine either alone or in combination), with mean chlorpromazine-equivalent doses of 1093.6 ± 484.2 mg/d. Symptoms were assessed using the Brief Psychiatric Rating Scale (BPRS) of Overall and Gorham (1962).
For the present study, we analyzed the electrophysiological data recorded during five dual-multielectrode penetrations of primary auditory cortex (A1) of a female macaque (Macaca mulatta) who had been prepared surgically for chronic awake electrophysiological recordings. Before surgery, the subject was adapted to a custom-fitted primate chair and to the recording chamber. All procedures were approved in advance by the Animal Care and Use Committee of the Nathan Kline Institute. Preparation of the subject for chronic awake intracortical recording was performed using aseptic techniques, under general anesthesia, as described previously (Schroeder et al., 1998). The tissue overlying the calvarium was resected, and appropriate portions of the cranium were removed. The neocortex and overlying dura were left intact. To provide access to the brain, plastic recording chambers (Crist Instruments) were positioned normal to the cortical surface of the superior temporal plane for orthogonal penetration of area A1, as determined by preimplant MRI. Together with socketed Plexiglas bars (to permit painless head restraint), the recording chambers were secured to the skull with orthopedic screws and embedded in dental acrylic. A recovery time of 6 weeks was allowed before we began data collection.
All subjects participated in the following tasks: (1) a “passive” paradigm where subjects were instructed to ignore the presented auditory stimuli (pure tones: duration, 50 ms; rise/fall, 5 ms; loudness, 75 dB SPL) while watching a silent video; (2) an “easy” condition, in which subjects were instructed to press a button when frequency-deviant tones (targets) occurred in a stream of standards [in this condition, deviants differed by 50% in pitch from the repetitive standards (1500 vs 1000 Hz), and almost all subjects performed at ceiling levels (>95% level)]; and (3) a “difficult” condition, in which the frequency of deviant tones was dynamically adjusted from 1020 (2% change) to 1600 (60% change) in logarithmic steps based upon a 3-down/1-up transform rule, in which difficulty was increased following three successive correct target detections and was decreased following one failed detection or false alarm (Leitman et al., 2010). This provided a mean correct performance of 78.7% for all subjects. In all three conditions, stimuli were presented rhythmically every 1500 ms, and the deviant probability was 20%.
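The adaptive rule used in the difficult condition can be illustrated with a minimal Python sketch. It assumes ten logarithmically spaced deviant frequencies between 1020 and 1600 Hz and a starting point at the easiest level; neither the number of levels nor the starting level is stated in the text, and the names `log_spaced_levels` and `ThreeDownOneUp` are hypothetical.

```python
def log_spaced_levels(f_min=1020.0, f_max=1600.0, n_levels=10):
    """Deviant frequencies spaced evenly on a logarithmic axis.

    n_levels is an assumption; the text only gives the end points.
    """
    ratio = (f_max / f_min) ** (1.0 / (n_levels - 1))
    return [f_min * ratio ** i for i in range(n_levels)]


class ThreeDownOneUp:
    """3-down/1-up staircase over a list of deviant frequencies.

    Three successive correct detections make the task harder (deviant
    moves closer to the 1000 Hz standard); a single miss or false alarm
    makes it easier.
    """

    def __init__(self, levels, start_index=None):
        self.levels = levels
        # Assumed starting point: the easiest (largest) deviance.
        self.index = len(levels) - 1 if start_index is None else start_index
        self.streak = 0

    @property
    def deviant_hz(self):
        return self.levels[self.index]

    def update(self, correct):
        """Register one target outcome and return the next deviant frequency."""
        if correct:
            self.streak += 1
            if self.streak == 3:
                self.index = max(self.index - 1, 0)
                self.streak = 0
        else:
            self.index = min(self.index + 1, len(self.levels) - 1)
            self.streak = 0
        return self.deviant_hz


if __name__ == "__main__":
    staircase = ThreeDownOneUp(log_spaced_levels())
    for outcome in [True, True, True, True, False]:
        print(round(staircase.update(outcome), 1))
```

In standard psychophysical analyses, a 3-down/1-up rule converges near 79% correct, which is in line with the 78.7% mean performance reported for this condition.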
The task the monkey had to perform was similar to the one performed by human subjects. We presented rhythmic streams of pure-tone beeps at 40 dB SPL (25 ms duration, 5 ms rise/fall time) with a constant stimulus-onset asynchrony (SOA) of 624.5 ms. The rhythmic tone stream consisted of standard, frequently repeating tones whose frequency was set to one of two values, which were determined based on the best frequency (BF) of the recording site: one of the frequency values corresponded to the BF, while the other was either 2 octaves higher (if the BF of the site was ≤8 kHz) or lower (if the site's BF was >8 kHz). Frequency deviants (2–4 semitone difference from the standards) randomly occurred in the stream of standard tones every 3–9 s (10% probability). In the beginning of behavioral training, a 0.25–1 ml juice reward was delivered to the subject simultaneously with each deviant through a tube. The tube was positioned such that the monkey had to stick out her tongue to get the juice. Licking was monitored using a simple contact detector circuit, the output of which was continuously recorded with Labview together with the timing of standard and deviant tones for off-line analyses. In this phase of training, the frequency difference between the standard and deviant tones was about 1 octave. After several (10–20) training sessions, the juice reward was omitted on 20% of the deviants. With a frequency difference of 2 semitones between target and standard tones, the subject responded correctly to 92% of the juiceless deviants (Lakatos et al., 2013).
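The frequency relationships described above reduce to simple arithmetic, sketched here in Python. The function names are hypothetical, and shifting the deviant upward is an assumption, since the text gives only the size of the deviance and not its direction.

```python
def paired_standard_hz(bf_hz):
    """Second standard frequency for a recording site.

    Two octaves above the best frequency (BF) if BF <= 8 kHz,
    otherwise two octaves below it.
    """
    return bf_hz * 4.0 if bf_hz <= 8000.0 else bf_hz / 4.0


def semitone_shift(freq_hz, semitones):
    """Frequency shifted by a number of semitones (12 semitones per octave)."""
    return freq_hz * 2.0 ** (semitones / 12.0)


if __name__ == "__main__":
    bf = 2000.0  # illustrative BF, not a value from the study
    print(paired_standard_hz(bf))             # non-BF standard
    print(round(semitone_shift(bf, 2), 1))    # 2-semitone deviant (upward, assumed)
    print(round(semitone_shift(bf, 4), 1))    # 4-semitone deviant (upward, assumed)
```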
Electrical recordings were obtained in a dark, sound-attenuated chamber from 64 scalp locations, consisting of expanded 10/20 placements, along with monopolar vertical and horizontal EOG electrodes using an Active II recording system (Biosemi), relative to nose reference. Activity was amplified with a bandpass of 0.001–100 Hz and digitized continuously at a sampling rate of 500 Hz.
Animals sat in a primate chair in a dark, isolated, electrically shielded, sound-attenuated chamber with head fixed in position, and were monitored with infrared cameras. Laminar profiles of field potentials (EEG) and concomitant population action potentials (MUA) were obtained using linear array multicontact electrodes (23 contacts, 100 μm intercontact spacing). Multielectrodes were inserted acutely through guide tube grid inserts, lowered through the dura into the brain, and positioned such that the electrode channels would span all layers of the cortex, which was determined by inspecting the laminar response profile to binaural broadband noise bursts. In this position, the uppermost channel was situated 100–300 μm above the surface of auditory cortex. Neuroelectric signals were impedance matched with a preamplifier (10× gain; bandpass DC, 10 kHz), and after further amplification (500×) the signal was split into field potential (0.1–500 Hz) and MUA (300–5000 Hz) ranges by analog filtering. Field potentials were sampled at 2 kHz with 16 bit precision; MUA was sampled at 20 kHz with 12 bit precision. Additional zero phase shift digital filtering (300–5000 Hz) and rectification were applied to the MUA data to extract the continuous estimate of cell firing. At the beginning of each experimental session, after refining the electrode position in the neocortex, we established the BF of the recording site using a “suprathreshold” method (Steinschneider et al., 1995). The method entails presentation of a stimulus train consisting of 100 random order occurrences of a broadband noise burst and pure tone stimuli with frequencies ranging from 353.5 Hz to 32 kHz in half-octave steps (duration, 100 ms; rise/fall time, 5 ms; SOA = 624.5 ms). Auditory stimuli were produced using Tucker-Davis Technology System III coupled with MF-1 free field speakers.
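As a worked illustration of the tone set used for BF mapping, the following Python sketch enumerates half-octave steps upward from 353.5 Hz. The function name is hypothetical, and the sketch assumes that the stated 32 kHz upper limit is a rounded value of the last half-octave step.

```python
import math


def half_octave_series(f_start=353.5, f_stop=32000.0):
    """Pure-tone frequencies from f_start up to about f_stop in half-octave steps."""
    # Number of half-octave steps needed to span the range, rounded to the
    # nearest integer because the nominal end point is assumed to be rounded.
    n_steps = round(2 * math.log2(f_stop / f_start))
    return [f_start * 2.0 ** (k / 2.0) for k in range(n_steps + 1)]


if __name__ == "__main__":
    freqs = half_octave_series()
    print(len(freqs))
    print([round(f, 1) for f in freqs])
```

Under this assumption, the series contains 14 tone frequencies, with the highest falling just below 32 kHz.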
Human surface and monkey intracranially recorded data were analyzed similarly, using Matlab. To enable the analysis of low-frequency (down to 0.5 Hz) oscillations in the time period between and around auditory stimuli, 10-s-long epochs (5 s prestimulus and 5 s poststimulus) were constructed from the continuous EEG recording. In the case of human data, the artifact rejection threshold was set to ±100 μV, and, due to more movement-related artifacts in the auditory tasks, we ended up with a different number of trials across conditions and subject groups. In monkey data, we rejected epochs where the summed amplitude of field potential recorded on all electrode channels exceeded ±3 SDs. Following automatic rejection, we visually inspected the remainder of the unfiltered trials for artifacts and excluded on average 5–10 further epochs. In the present study, we only analyzed responses to standard tones. To avoid the confounding effects of the P3 component and motor responses, we only included standards that were two trials following and two trials preceding the deviant tone (on average, 185.6, 170.4, and 117.6 trials for controls in passive, easy, and difficult conditions, and 153, 138.9, and 101.7 for patients in the same conditions). In averaged human auditory event-related potentials (ERPs), N1 was defined as the peak negativity within the 60–160 ms latency range (see Fig. 2A).
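The trial selection described above can be sketched in Python as follows. The sketch assumes that the selection criterion means no deviant occurred within the two trials before or after a retained standard, and that the ±100 μV threshold applies to every sample of a human epoch; both readings are interpretations, and the function names are hypothetical.

```python
def usable_standard_indices(is_deviant, guard=2):
    """Indices of standards with no deviant within `guard` trials on either side."""
    n = len(is_deviant)
    keep = []
    for i, deviant in enumerate(is_deviant):
        if deviant:
            continue  # only standards are analyzed
        lo, hi = max(0, i - guard), min(n, i + guard + 1)
        if not any(is_deviant[lo:hi]):
            keep.append(i)
    return keep


def passes_amplitude_threshold(epoch_uv, limit_uv=100.0):
    """True if no sample in the epoch exceeds +/- limit_uv microvolts."""
    return all(abs(v) <= limit_uv for v in epoch_uv)


if __name__ == "__main__":
    sequence = [0, 0, 0, 1, 0, 0, 0, 0]  # 1 marks a deviant trial
    print(usable_standard_indices(sequence))
    print(passes_amplitude_threshold([12.0, -40.5, 99.0]))
    print(passes_amplitude_threshold([12.0, -140.5, 99.0]))
```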
For the purpose of analyzing neuronal oscillations, instantaneous oscillatory amplitudes and phases were extracted by wavelet decomposition on 121 scales from 0.5 to 101 Hz (Morlet wavelet). To assess the degree of phase locking (the similarity of oscillatory phase across trials) to auditory stimuli in the data, the wavelet-transformed single-trial data were normalized (unit vectors) before averaging the trials, and the length (modulus) of the resulting vector was computed (Lakatos et al., 2008). The value of the mean resultant length, or intertrial coherence (ITC), ranges from 0 to 1; higher values indicate that single-trial oscillatory phases are clustered more closely around the mean phase than lower values. Since ITC values are dependent on the number of trials (Maris et al., 2007), we performed an analysis to determine to what extent our ITC results might be biased by unequal sample sizes (Fig. 1). We binned all delta phase values within controls and patients in each of the three conditions (passive, easy, and difficult), ending up with six groups of phases. Next, we performed 1000 random draws of trials in the 50–250 trial number range and calculated the ITC. Finally, we compared ITC values calculated from phases in the six phase groups for trial numbers that corresponded to our average lowest number of trials (101.7) and highest number of trials (185.6). We found that while in controls ITC values in the passive condition were the most biased (ITC ratio for lowest/highest trial count in passive, easy, and difficult conditions: 1.31, 1.05, and 1.01), in patients ITC values were biased similarly in all three conditions (1.24, 1.23, and 1.25). Our data indicate that the enhanced bias in the patient group compared with the controls during the performance of the auditory tasks is likely a result of the fact that ITC bias is strongly dependent on the uniformity of the phase distributions. We will discuss the implications of the ITC biases in the Results section.
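The ITC measure and the trial-count bias check described above can be sketched in Python. This is a minimal illustration rather than the analysis code of the study: it starts from phase angles that have already been extracted (the wavelet step is omitted), it uses a simulated pool of uniformly distributed phases instead of recorded data, and the function names are hypothetical.

```python
import cmath
import math
import random


def itc(phases):
    """Intertrial coherence: length of the mean of unit phase vectors (0 to 1)."""
    if not phases:
        raise ValueError("at least one phase value is required")
    return abs(sum(cmath.exp(1j * p) for p in phases) / len(phases))


def mean_subsampled_itc(phases, n_trials, n_draws=1000, seed=0):
    """Average ITC over random draws of n_trials phases from a pooled set."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        total += itc(rng.sample(phases, n_trials))
    return total / n_draws


if __name__ == "__main__":
    # Simulated pool with a uniform phase distribution (no true phase locking).
    pool_rng = random.Random(1)
    pool = [pool_rng.uniform(-math.pi, math.pi) for _ in range(5000)]

    low = mean_subsampled_itc(pool, 102)   # about the lowest mean trial count
    high = mean_subsampled_itc(pool, 186)  # about the highest mean trial count
    print("ITC ratio, lowest/highest trial count:", low / high)
```

With a uniform phase distribution the ratio exceeds 1, because chance-level ITC shrinks as the number of trials grows; tightly clustered phases yield a ratio close to 1, which matches the dependence of the bias on the uniformity of the phase distribution noted above.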
To extract the time course of beta/gamma range activity, we averaged oscillatory amplitudes in the 15–50 Hz range in the case of humans, and the 25–50 Hz frequency range in monkeys. To calculate the attended stimulus structure-related beta/gamma modulation index, we subtracted the interstimulus beta/gamma amplitudes (humans: −900 to −700 ms; monkeys: −400 to −300 ms) from the peristimulus beta/gamma amplitudes (humans: −100 to 100 ms; monkeys: −150 to −50 ms). In the case of intracortically recorded data, MUA modulation index was calculated the same way. Statistical analyses were performed using repeated-measures multivariate ANOVA.
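The modulation index defined above is the difference between two window averages, as in the following Python sketch. The default windows are the ones given for the human data; the amplitude trace in the usage example is synthetic, and the function names are hypothetical.

```python
def window_mean(times_ms, amplitude, start_ms, stop_ms):
    """Mean amplitude of samples whose time lies in [start_ms, stop_ms]."""
    values = [a for t, a in zip(times_ms, amplitude) if start_ms <= t <= stop_ms]
    if not values:
        raise ValueError("window contains no samples")
    return sum(values) / len(values)


def modulation_index(times_ms, amplitude, peri=(-100, 100), inter=(-900, -700)):
    """Peristimulus minus interstimulus mean beta/gamma amplitude."""
    return (window_mean(times_ms, amplitude, *peri)
            - window_mean(times_ms, amplitude, *inter))


if __name__ == "__main__":
    # Synthetic trace: high amplitude between stimuli, low around stimulus onset.
    times = list(range(-1000, 501, 10))
    trace = [1.0 if -900 <= t <= -700 else 0.4 if -100 <= t <= 100 else 0.7
             for t in times]
    print(modulation_index(times, trace))
```

A negative index corresponds to lower beta/gamma amplitude around stimulus onset than between stimuli; for the monkey data the windows would be replaced with the values given above.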
Auditory ERPs were collected from 20 healthy controls and 40 schizophrenia patients in three conditions: a passive condition, in which subjects were instructed to ignore auditory stimuli while viewing a silent movie, and an easy and a difficult task condition (see Materials and Methods), in which subjects were required to detect frequency deviants. In all conditions, stimuli were presented rhythmically with a SOA of 1500 ms.
Across conditions, the main auditory ERP measure of interest was a prominent N1/P2 component that was maximal frontocentrally (Fig. 2A). The amplitudes of the N1 component were similar in all three conditions and, as documented extensively by earlier studies (Javitt et al., 2008), were significantly reduced in schizophrenia patients versus controls. The most prevalent difference among conditions was actually observed in the “baseline” of the auditory ERPs: during the interstimulus interval, a large-amplitude, rhythmic delta frequency component emerged in the auditory task conditions, corresponding in wavelength to the SOA (Fig. 2B). This baseline fluctuation was evident in both active task conditions in control subjects and was most pronounced in the difficult condition. In contrast, the baseline appeared flat in all conditions in the recordings obtained from schizophrenia patients.
To determine the basis of the task-related baseline fluctuation, both ITC at the presentation rate (0.67 Hz) and single-trial delta oscillatory amplitudes were computed. As the averaged responses foreshadowed, our ITC analysis indicated robustly higher ITC with increasing engagement in the auditory task in controls (F(2,18) = 68.4, p < 0.00001), robustly lower delta coherence in patients overall, and a highly significant group × condition interaction (group: F(1,58) = 58.6, p < 0.0001; interaction: F(2,57) = 31.5, p < 0.00001). The interaction reflected a main effect of condition in controls with significant increases in ITC in the difficult task condition relative to both the passive and the easy task conditions (all p < 0.00001). In contrast, patients showed only a marginal alteration in delta ITC across conditions (F(2,38) = 3.07, p = 0.058). As a result, while delta ITC was similar across groups in the passive condition (t = 0.65, df = 58, p = 0.5), it was significantly different in the difficult condition (t = 7.37, df = 58, p < 0.00001).
Although, as Figure 1 illustrates, ITC values are biased by the number of trials, two observations confirm that this bias does not influence our main effects: first, the number of trials in all three conditions is greater in controls, which should result in smaller ITC values, “working against” the significant ITC difference between subject groups observed in the data; and second, although the number of trials is lowest in the difficult condition in both groups, which might contribute to the small increase in ITC in the case of patients, the size of this effect (1% increase; see Materials and Methods) cannot explain the large engagement-dependent ITC increase in controls (233% increase in the difficult compared with the passive condition). Notably, while there is a clear behavioral condition–ITC bias interaction in the control group (Fig. 1), this interaction is absent in patients, which likely reflects the fact that all three conditions had similar (and rather uniform) phase distributions in this subject group.
Distribution of the delta ITC was maximal frontocentrally (Fig. 2B, insets), similar in distribution to that of the auditory N1 potential (Fig. 2A), and thus consistent with generators within supratemporal auditory cortex, although contributions of additional brain regions cannot be excluded. An additional indication for auditory involvement is that the observed deficit in delta ITC in schizophrenia patients correlates well with their behavioral deficits, which we describe in detail later in the Results. Significant delta ITC was also observed over left and right central regions, consistent with the potential additional involvement of motor cortical areas (Gerloff et al., 1998).
In contrast to delta ITC, which reflects the similarity of delta phase across trials, single-trial delta amplitude [measured both centered on the 0.67 Hz presentation rate (Fig. 2C, boxplots) and in the “classic” 0.5–3 Hz delta band] was not significantly different between groups (F(1,58) = 2.11, p = 0.15 and F(1,58) = 0.26, p = 0.6), and there was no significant group × condition interaction (F(2,57) = 0.28, p = 0.76 and F(2,57) = 0.04, p = 0.96). These results indicate that while ongoing delta oscillations are present in the EEG in all three conditions, their alignment to the auditory stimulus structure (context) evolves from little or none in the passive condition to higher levels as task demands increase in the easy and difficult task conditions. Therefore, rather than being a result of de novo generated neuronal activity, the most likely explanation for the rhythmic baseline fluctuation observable in the ERP in control subjects is task-dependent entrainment of ongoing neuronal oscillations. Our results also show that increased task demand in schizophrenia patients does not coincide with increases in oscillatory entrainment that would enable the alignment of endogenous oscillatory activity to the predictable cadence of relevant stimuli.
Recent findings in nonhuman primates indicate that the mechanism by which entrained delta frequency oscillations influence stimulus detection and behavioral responding is by predictively modulating excitability in local neuronal ensembles (Lakatos et al., 2008, 2013; Saleh et al., 2010). The modulatory effects associated with delta entrainment are evidenced by a phase-dependent, task structure-bound, rhythmic modulation of net neuronal beta/gamma band high-frequency (HF) activity and ensemble firing (MUA).
Based on the results of previous studies (Fries et al., 2001; Lakatos et al., 2008), the amplitudes of gamma frequency (>25 Hz) neuronal activity and MUA appear to be intimately connected, in that they are modulated in the same way by attention and they concurrently signal changes in net neuronal excitability of a neuronal ensemble. Although there is rapidly mounting evidence that beta oscillations can also be linked to some aspects of attentional processes (Buschman and Miller, 2007, 2009; Kay et al., 2009; Saleh et al., 2010; van Ede et al., 2010; Kilavik et al., 2012), their functional role appears to be different from that of gamma in characteristic and sometimes opposing ways (Fries et al., 2001; Engel and Fries, 2010; Bastos et al., 2012; Siegel et al., 2012). Recent studies suggest that beta might play a larger role in feedback functional connectivity, as opposed to gamma, which has been suggested to play a key role in feedforward communication between hierarchically organized nodes of stimulus processing (Fries, 2005; Buschman and Miller, 2007; Bosman et al., 2012). In the motor cortex, where 15–30 Hz beta activity dominates high-frequency neuronal activity both in humans and nonhuman primates (Saleh et al., 2010; Kilavik et al., 2012), several studies provide clear evidence that its amplitude is predictively modulated by the relevant sensory-motor temporal context (Sanes and Donoghue, 1993; Murthy and Fetz, 1996; Donoghue et al., 1998; Saleh et al., 2010; Bidet-Caulet et al., 2012; Kilavik et al., 2012). Thus, while the precise role of high-frequency (and for that matter low-frequency) neuronal activity is still being debated, it is clear that the amplitudes of both gamma and beta oscillatory activity can be predictively modulated during attentive sensorimotor behavior.
Consistent with these prior findings, in healthy subjects HF (15–50 Hz) activity displays significantly more modulation in the active task compared with the passive condition and in the difficult versus easy task (Fig. 2D). By contrast, patients appeared to display sustained levels of beta/gamma band activity across both passive and active conditions, with little alteration between conditions.
To quantify HF activity modulation, we calculated the amplitude difference between peristimulus (−100 to 100 ms) and interstimulus (−900 to −700 ms) beta/gamma (15–50 Hz) activity (“modulation index”). We found that this modulation index was similar across subject groups in the passive condition and was not significantly different from 0 in either group (t = 0.39, df = 58, p = 0.6). As with delta ITC, we also found a significant group × condition interaction (F(2,57) = 12.3, p < 0.0001), reflecting a significant task difficulty-dependent modulation of HF activity in the control group, but no significant task structure-related HF modulation in patients (F(2,18) = 14.5, p < 0.0001 vs F(2,38) = 1.18, p = 0.3).
In controls, we observed significantly larger modulation in both active conditions compared with passive stimulation, and there was also a significantly greater modulation in the difficult versus the easy condition (all p < 0.01). In stark contrast, task-related HF activity modulations were not detectable in patients. As with delta ITC, topographical distributions of the beta/gamma modulation index (Fig. 2D) are consistent with mainly auditory and motor origins in control subjects. Specific frontocentral localization was observed in the 25–36 Hz frequency range (Fig. 3), which corresponds well to the frequency of dominant gamma band oscillatory activity in nonhuman primate auditory cortex (Lakatos et al., 2005, 2009), supporting auditory cortical generation. In contrast, slower and faster beta/gamma range activity mapped respectively to sensorimotor and posterior regions, suggesting somewhat different underlying processes.
The contingent-negative variation (CNV) is a slow brain potential that can be observed in tasks such as the present one that require a motor response, and that is closely associated with motor preparation (Brunia and Boxtel, 2001). A hallmark of the CNV is that it increases greatly in amplitude as the probability of a motor response increases (Stadler et al., 2006; Ford et al., 2010). Although auditory oddball paradigms, such as the one used here, are not designed a priori to isolate CNV-like activity, it has nevertheless been recently reported that CNVs may occur preceding stimuli with an imminent response, and that CNV amplitude is reduced in schizophrenia (Ford et al., 2010; Dias et al., 2011).
We therefore also analyzed our low-frequency effects as a function of deviant stimulus and thus motor response probability in both the difficult and easy conditions to isolate the potential contribution of CNV to our observed between-subject group differences in delta ITC (Fig. 4). Visual inspection of the ERP related to sequentially presented standard tones following a deviant suggests relatively constant delta activity across all stimulus positions (Fig. 4A), with a slight increase in amplitude with increasing deviant probability. This is also signaled by a slight stimulus position-related increase in the amplitude of prestimulus (−50 to 0 ms) negativity (Fig. 4B). Nonetheless, neither the main effect of difficulty (F = 1.10, df = 1,18, p = 0.31) nor that of position (F = 0.71, df = 4,15, p = 0.6) was significant (repeated-measures ANOVA). Similarly, neither the main effect of standard position (F = 1.81, df = 4,15, p = 0.18) nor the difficulty by position interaction (F = 1.33, df = 4,15, p = 0.31) was significant for delta ITC. Furthermore, neither the linear (p = 0.4) nor quadratic (p = 0.8) trends for the position analysis were independently significant. Finally, when delta ITC at positions 4/5 was compared with activity at prior positions, no significant difference was observed in either the difficult (p = 0.11) or easy (p = 0.78) conditions. Additionally, we found a highly significant main effect of task difficulty across all positions (F = 46.2, df = 1,18, p < 0.00001), which was significant at each standard position considered independently (all p ≤ 0.001). Together, these data support a significant, task difficulty-dependent entrainment effect across conditions, but provide no evidence that the magnitude of delta ITC varies as a function of stimulus position, as would be expected from a CNV contribution (Ford et al., 2010). It should be noted that even for the fifth stimulus position following a deviant in the present study (Fig. 4, S5), conditional deviant probability was substantially lower than for terminal deviants in Ford et al. (2010), potentially accounting for the different CNV findings between the two studies.
In addition to the attention-related modulations of delta and high-frequency activity, we observed a large difference in the amplitude of theta/alpha frequency (5–12 Hz) oscillatory activity both across conditions and between groups (Fig. 2C). The mainly theta frequency (~7 Hz) ongoing activity observable in the passive condition was significantly larger in patients (see also Hanslmayr et al., 2012) and showed a frontocentral maximum in topographic maps consistent with at least partly auditory generators (Fig. 5A–C). In contrast, topographical maps of the attention-related increase in alpha band activity (~9 Hz) indicate loci primarily over posterior sites (Fig. 5D). Alpha oscillations have been thought to reflect “idling” of the visual system in the case when no visual input is present (Adrian and Matthews, 1934), and were also implicated in regulating excitability of visual areas via suppressing the activity of task-irrelevant cortical regions (Ray and Cole, 1985; Neuper and Pfurtscheller, 2001; Klimesch et al., 2007; Haegens et al., 2012). Unlike in the passive condition, during the performance of the auditory tasks subjects were not required to watch a silent movie; thus, both idling and active suppression might contribute to the alpha increase observed during the tasks. Importantly, the increase in posterior alpha activity was similar across groups (Fig. 5E). Thus, to the degree that increased posterior alpha activity reflects engagement in the auditory task, the two subject groups displayed similar levels of task engagement.
Together, these findings indicate that while controls are able to entrain their ongoing oscillatory activity to attended task structure, patients do not efficiently modulate the temporal structure of their ongoing oscillatory activity even in a demanding task. Interestingly, while prior studies suggest that the role of entrainment is to provide a high excitability temporal window around the time when attended stimuli are predicted to occur (Lakatos et al., 2008; Mathewson et al., 2010; Saleh et al., 2010; Stefanics et al., 2010), the rhythmic modulation of HF activity in active control subjects resulted in a predictive reduction measured at scalp electrodes in our data (Fig. 2D), indicating suppressed excitability. To resolve this issue, we analyzed data recorded in a monkey directly from auditory cortex, since in these recordings the firing of the local neuronal ensemble (MUA) provides a direct measure of excitability changes. The monkey was performing a target detection task similar to that of human subjects. Neuroelectric activity was obtained simultaneously with two linear array multicontact electrodes that were positioned 2 mm apart along the tonotopic axis of area A1, permitting concurrent sampling at sites tuned to the attended frequency content (BF sites) and sites tuned on average 2 octaves away (non-BF sites). In addition to intracortical MUA, we analyzed local field potentials recorded directly above BF and non-BF sites.
As in humans, the baseline activity of both sites fluctuated at a wavelength corresponding to the SOA of auditory stimuli (Fig. 6A). Of note, however, field potentials recorded just above A1 reflected the opposite phase entrainment of supragranular cortical neuronal oscillations in BF versus non-BF regions (Lakatos et al., 2013). Gamma activity also fluctuated with opposite phase: in BF regions, gamma was enhanced immediately before stimulus onset, whereas an opposite effect was observed in non-BF regions (Fig. 6B); both effects were significant (p < 0.01). Analysis of the simultaneously recorded MUA within the underlying regions of auditory cortex revealed that gamma modulation was also associated with corresponding alterations in local neuronal ensemble firing (Fig. 6B,C). Critically, these results show that high-frequency activity and MUA are suppressed around the peristimulus relative to interstimulus timeframe in non-BF regions, indicating that the main effect of stimulus predictability was a suppression of excitability at the time of stimulation in these neuronal ensembles. A comparison of mean delta phases at stimulus onset measured in scalp and intracranially recorded data reveals that the phase of entrained delta oscillatory activity in controls corresponds to the phase of entrained delta oscillations above non-BF regions (Fig. 6D).
A critical caveat in interpreting scalp-recorded data is that with narrow-band stimuli like pure tones, the size of the A1 region processing attended stimuli (i.e., the BF region) is extremely small relative to the extent of the rest of A1. The latter represents a large non-BF region whose net response might be predictively attenuated by attention, as a recent study indicates (Lakatos et al., 2013). Therefore, while intracranial recordings can capture increases in gamma activity associated with increased neuronal excitability in cortical regions processing attended stimulus features (present results; Flinker et al., 2010; Lakatos et al., 2013), our data indicate that at the macroscopic level of scalp recordings, predictive cortical modulation can manifest as a net decrease in activity at attended time points, reflecting the predictive net suppression of neuronal activity in regions processing nonattended (interfering) stimulus features. This could also explain the absent or relatively modest alterations in surface N1 typically observed in active versus passive auditory paradigms (Fig. 2A), and the attenuation of N1 that has been reported when both the timing and pitch of auditory stimuli are predictable versus random (Lange, 2009).
To determine the functional consequences of oscillatory entrainment deficits in schizophrenia, we conducted follow-up correlational analyses. Within the patient group, reductions in delta ITC in the difficult condition correlated strongly with their ability to detect frequency differences (r = −0.55, p = 0.001; Fig. 7A), which indicates that predictive stimulus processing aids significantly in frequency discrimination. Delta ITC also correlated highly with deficits in the generation of P3 to the target stimuli in patients in both the easy condition (r = 0.62, p < 0.001) and the difficult condition (r = 0.35, p = 0.036; Fig. 7B). This suggests either that a failure of low-level active predictive sensory processing leads to significant deficits in higher-level auditory response or that delta entrainment and P3 generation share a common mechanism (phase reset), which is impaired in patients and thus underlies both deficits independently. By contrast, correlations were not significant in controls considered independently (all p > 0.05), suggesting that, in controls, levels of delta modulation were not rate limiting for performance.
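The delta ITC entering these correlations can be illustrated with a short sketch. It assumes the common definition of ITC as the length of the mean resultant vector of single-trial phases at stimulus onset (0 for uniformly scattered phases, 1 for identical phases). The function name `intertrial_coherence` and the input format (a list of phases in radians) are illustrative assumptions, not a description of the analysis software used in this study.

```python
import cmath
import math


def intertrial_coherence(phases):
    """Return the length of the mean unit phase vector across trials.

    `phases` is a sequence of single-trial phase angles in radians
    (hypothetical input format). The result lies between 0 and 1.
    """
    if not phases:
        raise ValueError("at least one trial phase is required")
    # Map each phase onto the unit circle and average the vectors.
    mean_vector = sum(cmath.exp(1j * p) for p in phases) / len(phases)
    return abs(mean_vector)


# Identical phases on every trial give an ITC close to 1.
aligned = [0.5] * 10
# Phases spread evenly around the circle give an ITC close to 0.
scattered = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]

print(intertrial_coherence(aligned))
print(intertrial_coherence(scattered))
```

Under this definition, a reduced ITC means that delta phase at stimulus onset varies more from trial to trial, which is how weaker entrainment would appear in the measure.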
To determine the degree to which deficits in delta generation contributed to tone-matching and P3 deficits in patients versus controls, ANCOVAs were conducted with delta ITC variables as covariates. Once ITC was entered as a covariate into between-group analyses, the between-group differences in tone-matching (F(1,47) = 0.01, p = 0.9) and P3 generation (F(1,52) = 1.30, p = 0.26) became nonsignificant. By contrast, in both cases, the correlation with ITC was highly significant across groups (tone-matching: F(1,47) = 10.9, p = 0.002; P3: F(1,52) = 7.76, p = 0.007), suggesting that between-group differences in tone-matching thresholds and P3 generation were driven largely by between-group deficits in delta entrainment.
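The logic of testing a group difference after adjusting for a covariate can be sketched as a comparison of two nested regression models. The sketch below assumes a standard equal-slopes ANCOVA with two groups and one covariate: the reduced model predicts the outcome from ITC alone, the full model adds a separate intercept per group, and the F statistic has (1, n − 3) degrees of freedom. The function names and the toy data are hypothetical and are not taken from this study.

```python
def _centered_sums(x, y):
    """Return (Sxx, Sxy, Syy), the centered sums of squares and products."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxx, sxy, syy


def ancova_group_f(x_a, y_a, x_b, y_b):
    """F test for a group effect on y after adjusting for covariate x.

    Assumes a common slope in both groups (equal-slopes ANCOVA).
    Returns (F, error degrees of freedom).
    """
    x_all = list(x_a) + list(x_b)
    y_all = list(y_a) + list(y_b)
    # Reduced model: one regression line for everyone.
    sxx, sxy, syy = _centered_sums(x_all, y_all)
    rss_reduced = syy - sxy ** 2 / sxx
    # Full model: common slope, separate intercept per group.
    parts = [_centered_sums(x_a, y_a), _centered_sums(x_b, y_b)]
    sxx_w = sum(p[0] for p in parts)
    sxy_w = sum(p[1] for p in parts)
    syy_w = sum(p[2] for p in parts)
    rss_full = syy_w - sxy_w ** 2 / sxx_w
    df_error = len(x_all) - 3
    f_value = (rss_reduced - rss_full) / (rss_full / df_error)
    return f_value, df_error


# Toy data: the outcome depends on the covariate only, plus small residuals.
x_a, x_b = [1, 2, 3, 4], [5, 6, 7, 8]
y_a = [2.1, 3.9, 6.1, 7.9]
y_b = [10.1, 11.9, 14.1, 15.9]
print(ancova_group_f(x_a, y_a, x_b, y_b))

# Same data with a constant offset added to group B: a large group effect.
y_b_shifted = [v + 5 for v in y_b]
print(ancova_group_f(x_a, y_a, x_b, y_b_shifted))
```

In the first toy case the group term adds little once the covariate is in the model, which mirrors the pattern reported above for tone-matching and P3; in the second, a group difference remains after adjustment.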
Finally, inspection of the tone frequency discrimination ability versus the ITC curve across groups suggested a nonlinear relationship, reflecting the fact that the tone-matching ratio asymptotically approaches, but cannot reach, a value of 1 (i.e., no tone difference between standard and deviant). We therefore conducted a nonlinear curve estimation across groups. Best fit was achieved with a hyperbolic fit (tone-matching ratio vs inverse ITC), constrained to converge on a tone-matching ratio of 1. In this analysis, both ITC (F(1,48) = 31.5, p < 0.000001) and group membership (F(1,48) = 13.5, p = 0.001) contributed independently. Thus, highly significant ITC contributions to tone-matching impairments were found both within patients alone and in across-group analyses, even when contributions of group membership were explicitly modeled. Additional impairments above and beyond those related to delta entrainment deficits were also observed when nonlinear approaches (i.e., ANCOVA with differential slopes or hyperbolic fit) were used.
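A constrained hyperbolic fit of this kind can be sketched as follows. The exact parameterization used in the analysis is not given here, so the sketch assumes the simplest form consistent with the description: tone-matching ratio = 1 + b / ITC, where the single free parameter b is estimated by least squares and the ratio converges on 1 as 1/ITC approaches 0. The function names, the omission of the group term, and the example values are hypothetical.

```python
def fit_hyperbola(itc, ratio):
    """Least-squares estimate of b in: ratio = 1 + b / ITC.

    Regressing (ratio - 1) on (1 / ITC) through the origin enforces the
    constraint that the ratio converges on 1 as 1 / ITC approaches 0.
    """
    inverse_itc = [1.0 / value for value in itc]
    excess = [r - 1.0 for r in ratio]
    numerator = sum(u * e for u, e in zip(inverse_itc, excess))
    denominator = sum(u * u for u in inverse_itc)
    return numerator / denominator


def predict_ratio(b, itc_value):
    """Predicted tone-matching ratio for a given ITC under the assumed model."""
    return 1.0 + b / itc_value


# Hypothetical values generated from the assumed model with b = 0.02.
itc_values = [0.1, 0.2, 0.4]
ratios = [1.2, 1.1, 1.05]

b_hat = fit_hyperbola(itc_values, ratios)
print(b_hat)
print(predict_ratio(b_hat, 0.5))
```

Fitting through the origin in the transformed variables is what imposes the asymptote at 1. A model that also tests group membership, as in the analysis above, would add a group term to this regression.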
The nonlinear relationship between delta entrainment and tone-matching thresholds is a post hoc finding that must be confirmed in future studies. Nevertheless, it is consistent with the fact that the threshold ratio must converge asymptotically on a value of 1. Furthermore, it may be explained by the fact that the entrainment of delta oscillations, as reflected in ITC, has multiple functional roles in auditory attentive processes in that it both enhances attended frequency content and, at the same time, suppresses auditory regions tuned to different frequencies, thereby sharpening the sensory representation of stimuli at key time points (Lakatos et al., 2013). Besides serving as a two-dimensional spectrotemporal filter, entrainment also stabilizes the responses to attended stimuli, which is critical for determining stimulus constancy and thus frequency deviations reliably. A third, very likely role of entrainment is to coordinate functional connectivity through coherence (Fries, 2005). Thus, diminished entrainment results in a multidimensional functional impairment of auditory perceptual processes.
Besides behavioral (tone discrimination) and electrophysiological (P3) correlations, we also found that reductions in delta ITC correlated significantly with increased overall severity of symptoms as determined using the BPRS (r = −0.44, p = 0.017). Among subscales, a significant correlation was observed only with the positive symptoms subscale, which incorporates the items of hallucinations, delusions, and excitation (r = −0.44, p = 0.016).
The present findings have implications for both normal information processing and for cognitive impairments in disorders associated with prominent auditory dysfunction. We establish for the first time in humans that active predictive entrainment of rhythmic neuronal excitability fluctuations is systematically enhanced with increasing task demands, which demonstrates the importance of low-frequency entrainment mechanisms for normal cognition. In schizophrenia, failure of delta entrainment correlated highly with increased discrimination thresholds, attesting to the importance of this process for normal perceptual function.
Until recently, the alignment of internal excitability fluctuations to attended stimulus structure was interpreted as resulting in a predictive response gain at time points when attended stimuli occur (Lakatos et al., 2008; Stefanics et al., 2010; Besle et al., 2011). The predictive gamma-amplitude reduction we observe at the macroscopic level in humans appears to argue for the importance of suppression in active predictive sensing, at least in the frequency discrimination paradigm used here, which would benefit from sharper tuning of auditory cortical regions. By demonstrating that the same reductions are observable in mesoscopic local field potential recordings above cortical sites that are not tuned to the attended frequency content, our findings support the recent hypothesis that in the auditory system, active predictive sensing not only increases excitability within stimulus- and task-relevant regions, but also sharpens the representation of attended stimuli by downregulating the excitability of stimulus- or task-irrelevant ensembles (Lakatos et al., 2013). In schizophrenia, the failure of these predictive modulatory processes is likely to form the backbone of the disconnect between the external environment and the brain.
Predictive brain processes can be divided into two major categories: active (conscious) ones that facilitate the perception of sensory/motor events (like the one described here); and automatic (subconscious) ones whose main role is to suppress the perception of sensory/motor events. Interestingly, both active and automatic predictive sensory processes are diminished in schizophrenia (present results; Ford et al., 2010; Ford and Mathalon, 2012; Todd et al., 2012), which might indicate similar underlying mechanisms. Our results and those of Lakatos et al. (2013) suggest that active predictive auditory processes use ongoing oscillations entrained with their high excitability phases in neuronal ensembles processing attended frequency content at specific times, while the rest of the neuronal ensembles are suppressed by counter phase-entrained oscillatory activity at times when stimuli are predicted to occur. Even though, as our results suggest, the suppression of neuronal excitability predominates in auditory cortical regions at critical time points, this only serves to attenuate ignored frequency content and enhance the sensory representation of attended stimuli. As opposed to this, the goal of automatic or passive predictive processes is to suppress self-generated or frequently occurring nonrelevant items, and only signal whether a change is detected (Houde et al., 2002; Friston, 2005; Baldeweg, 2006; Bäss et al., 2008; Winkler et al., 2009; Chen et al., 2011; Costa-Faidella et al., 2011; Knolle et al., 2013). In these cases, an inverse pattern would be ideally suited as the mechanism: oscillations entrained with their suppressive phases could be centered on the spectrotemporal properties of self-generated or frequently occurring nonrelevant sounds, while the rest of auditory cortex could be entrained with their opposing, high-excitability phases. 
As a result, if a to-be-suppressed, predicted stimulus would change in its timing or frequency, it would fall outside the “spectrotemporal sweet spot” of suppression, and the response to it would be enhanced compared with correctly predicted stimuli, alerting higher-order brain regions. A study by Eliades and Wang (2005) does provide hints that counter phase excitability modulation might be involved in modulating vocalization-related activity in primate auditory cortex. They found that the firing of large, mostly inhibited units during, in some cases, rhythmic multiphrase vocalization is modulated in the opposite phase compared with all unit and multiunit activity recorded (Eliades and Wang, 2005, their Fig. 9), and that vocalization-related suppression was dominant in the supragranular layers, where entrained oscillatory activity is largest in amplitude (Lakatos et al., 2005, 2008, 2013). Theoretically, oscillatory entrainment is an ideal candidate for the mechanism of passive or automatic predictive processes, since most self-generated events (e.g., speech, walking, scratching) are rhythmic. Determining how high-frequency neuronal activity is predictively modulated in relation to the timing of these rhythmic events could provide important clues in trying to understand the mechanisms underlying automatic predictive suppression.
At present, the neural mechanisms underlying the reactive and predictive modulation of ongoing neuronal oscillations (phase reset and entrainment, respectively) remain to be determined. One of the proposed pathways is nonspecific thalamocortical inputs regulated by the reticular nucleus of the thalamus (TRN; Zikopoulos and Barbas, 2007; Lakatos et al., 2009). There is also rapidly accumulating evidence for the involvement of thalamocortical networks and TRN dysfunction in the pathogenesis of schizophrenia (Martínez et al., 2008, 2012; Ferrarelli and Tononi, 2011; Pinault, 2011; Vukadinovic and Rosenzweig, 2012). Furthermore, treatment with an NMDA antagonist, such as the psychotomimetic drug phencyclidine, reduces delta activity and increases ongoing gamma activity within thalamocortical loops (Hong et al., 2010; Kiss et al., 2011). Thus, the observed deficits in delta entrainment and gamma modulation may reflect underlying NMDA receptor dysfunction in the thalamocortical circuitry, consistent with glutamatergic theories of schizophrenia (Javitt and Zukin, 1991).
Overall, our results provide support for the newly emerging concept that rather than being a passive recipient of informational content, sensory cortices continuously mold themselves to take advantage of both temporal and informational predictability within the environment, as a way of optimizing information selection and processing. Furthermore, the breakdown of these mechanisms contributes significantly to impairments in the quality and efficiency of perceptual and cognitive function. Finally, at the macroscopic level of human surface EEG, reductions, rather than increases, in task-related high-frequency activity can serve as the best available indices of the efficiency and integrity of active information-processing mechanisms.
This work was supported by National Institutes of Health Grants DC010415 and DC012947 from the National Institute on Deafness and Other Communication Disorders, and Grants MH061989, MH060358, MH082790, MH086385, and MH49334 from the National Institute of Mental Health. The authors declare no competing financial interests.