Origins of impaired adaptive functioning in schizophrenia remain poorly understood. Behavioral disorganization may arise from an abnormal reliance on common combinations between concepts stored in semantic memory. Avolition-apathy may be related to deficits in using goal-related requirements to flexibly plan behavior. We recorded event-related potentials (ERPs) in 16 medicated schizophrenia patients and 16 healthy controls in a novel video paradigm presenting congruous or incongruous objects in real-world activities. All incongruous objects were contextually inappropriate, but the incongruous scenes varied in comprehensibility. Psychopathology was assessed with the Scales for the Assessment of Positive and Negative Symptoms (SAPS/SANS), and the Brief Psychiatric Rating Scale. In patients, an N400 ERP, thought to index activity in semantic memory, was abnormally enhanced to less comprehensible incongruous scenes, and larger N400 priming was associated with disorganization severity. A P600 ERP, which may index flexible object-action integration based on goal-related requirements, was abnormally attenuated in patients, and its smaller magnitude was associated with the SANS rating of impersistence at work or school (goal-directed behavior). Thus, distinct neurocognitive abnormalities may underlie disorganization and goal-directed behavior deficits in schizophrenia.
Deficits in adaptive goal-directed behavior contribute to disability in schizophrenia (Velligan et al., 1997; Poole et al., 1999). Behavioral abnormalities may include context-inappropriate commission errors that may appear bizarre and out of place (Andreasen, 1984b). These behaviors constitute part of the Disorganized Type of schizophrenia that also includes disorganized speech, or positive thought disorder (Liddle, 1987; American Psychiatric Association, 1994; Andreasen et al., 1995a). As has been proposed to explain the disorganized speech in schizophrenia (Maher, 1983; Goldberg and Weinberger, 1995; Aloia et al., 1996; Maher et al., 2005; Elvevag et al., 2007), behavioral disorganization may be a manifestation of an underlying abnormality of the neural activity mediating semantic memory. In addition, schizophrenia patients can experience severe treatment-refractory negative symptoms of avolition-apathy that include deficits in goal-directed behavior (Andreasen, 1984b; Kiang et al., 2003; van Reekum et al., 2005). A subset of avolition-apathy symptoms has been linked to abnormalities in cognitive operations necessary to construct plans of action (Rempfer et al., 2003; Levy and Dubois, 2006; Godbout et al., 2007; Gold et al., 2008). Particularly on more complex, non-routine tasks, such planning may depend on a neural system supporting a specific type of conceptual knowledge encoding goal-related requirements of behavioral actions (Sitnikova et al., 2008a; Sitnikova et al., 2008b). Execution and comprehension of behavior may share neural systems subserving real-world knowledge (Humphreys and Forde, 1998; Rizzolatti et al., 2001; Ruby et al., 2002), and the field of cognitive neuroscience has developed comprehension paradigms that allow us to study specific neurocognitive processes while controlling for confounding variables.
Therefore, to investigate neurocognitive abnormalities in conceptual processing that may underlie behavioral dysfunctions in schizophrenia, the present study assayed event-related potentials (ERPs), which directly measure electrophysiological brain activity, while patients and controls comprehended real-world activities depicted in short video clips.
According to a prevailing theory, semantic memory stores information about a person's previous real-world experiences in a structured fashion (Bower et al., 1979; Brewer and Dupree, 1983; Hutchison, 2003; Zacks et al., 2007; Sitnikova et al., 2008b). Representations of individual concepts are thought to have connections of varying strength, depending on factors such as their feature similarity or how often they have been experienced in the same context. These graded semantic relationships are believed to be accessed and used in comprehension and behavior, particularly in familiar situations (see footnote 1). In the laboratory setting, they may account for faster processing of word and picture stimuli preceded by the semantically related (vs. unrelated) context (reaction time priming -- Stanovich and West, 1983; Fischler and Bloom, 1985). Of particular relevance, common relationships between actions and entities might be represented within such graded semantic networks. Comprehenders have been reported to process words faster (McRae et al., 2001; McRae et al., 2005) and preferentially look at real-world objects (Kamide et al., 2004) when the stimuli convey customary combinations between entities and actions, especially when the specific role that a target entity usually plays in an action is constrained by the context (Ferretti et al., 2001).
Neuroimaging during comprehension of words and visual images has implicated a broad network of cortical regions in processing common semantic relationships. Stimuli that cannot be easily mapped onto the semantic memory networks accessed by the preceding context (vs. contextually-appropriate stimuli) evoke an increased response within the left inferior prefrontal and temporal cortices (e.g., Kotz et al., 2002; Friederici et al., 2003; Kuperberg et al., 2003a; Rossell et al., 2003; Simons et al., 2003; Cardillo et al., 2004; Giesbrecht et al., 2004; Blondin and Lepage, 2005; Wheatley et al., 2005; Kuperberg et al., 2008b, reviewed by Van Petten and Luka, 2006). Moreover, names or pictures of real-world objects and verbs conveying object-directed actions activate overlapping brain regions in the temporal, parietal and premotor cortices (Martin et al., 1996; Grafton et al., 1997; Grabowski et al., 1998; Chao et al., 1999; Moore and Price, 1999; Chao and Martin, 2000; Tyler et al., 2003; Bedny and Thompson-Schill, 2006; Kemmerer et al., 2007), possibly because both of these stimulus types access semantic networks representing usual combinations of actions and entities.
Previous ERP studies have identified a negative-going ERP waveform, peaking at approximately 400 ms after stimulus presentation (the N400 – e.g., Kutas and Hillyard, 1980a, 1980b; Barrett and Rugg, 1990), that may reflect mapping of perceptual input onto the semantic memory networks (Sitnikova et al., 2006; Sitnikova et al., 2008a). In healthy participants, the N400 is evoked during comprehension of language and visual images, and its amplitude is inversely correlated with the strength of semantic relationship between the eliciting stimulus and its preceding context (see footnote 2; Kutas and Hillyard, 1980b, 1989; Federmeier and Kutas, 1999; McPherson and Holcomb, 1999; Federmeier and Kutas, 2001).
A large body of research has examined the utilization of semantic memory in schizophrenia patients. Despite some differences in findings, possibly due to factors such as variation in patients' symptomatology or medication, these studies suggest several general conclusions. Reaction time priming studies showed that, even though automatic spread of activation within semantic memory networks (e.g., when asynchronies between the context and target stimuli are short) may be relatively intact and lead to normal priming, deficits may exist in the strategic use of these networks (e.g., when the context-target asynchronies are longer -- Vinogradov et al., 1992; Barch et al., 1996; when comprehension requires inhibiting irrelevant information -- Titone et al., 2000; Titone et al., 2002). Neuroimaging suggests that these strategic deficits may be related to abnormally reduced activations within the left inferior prefrontal and temporal cortices in patients with schizophrenia (Han et al., 2007). In agreement with the reaction time studies, some ERP studies (e.g., Andrews et al., 1993; Niznikiewicz et al., 1997; Olichney et al., 1997; Sitnikova et al., 2002; Ruchsow et al., 2003; Kuperberg et al., 2006c; Kiang et al., 2007), but not all (e.g., Salisbury et al., 2000; Salisbury et al., 2002; Condray et al., 2003; Kostova et al., 2005; Kiang et al., 2008), found that the increase in the N400 to contextually incongruous (vs. congruous) target words was comparable between schizophrenia patients and healthy controls.
A subset of studies suggest that, at least under some experimental conditions, processing within semantic memory networks may be excessive in schizophrenia (Manschreck et al., 1988; Kwapil et al., 1990; Spitzer et al., 1993a; Spitzer et al., 1993b; Spitzer et al., 1994; Moritz et al., 2001; Mathalon et al., 2002; Moritz et al., 2003; Kreher et al., 2007). These studies found that, particularly in patients with positive thought disorder, reaction time and N400 attenuation for semantically related targets may be abnormally increased, and robust priming may even occur between context and target stimuli that are remotely related (e.g., ‘camel’ – ‘fox’) or indirectly related (e.g., ‘lemon’ is related to ‘sour’, and hence indirectly related to ‘sweet’). Interestingly, neuroimaging recordings under automatic processing conditions have provided some evidence for abnormally high levels of neuronal activation within the left inferior prefrontal and temporal cortices in schizophrenia that were correlated with positive thought disorder (Kuperberg et al., 2007a). In addition, the ERP data suggested that inappropriate activation within semantic memory in schizophrenia may disrupt sentence comprehension. The N400 in patients was abnormally attenuated to target words that were incongruous with the global preceding sentence context but were semantically related to a frequently used meaning of a homograph (a multi-meaning word) embedded within the context (Sitnikova et al., 2002). Finally, other ERP studies have found that schizophrenia patients may show enhanced efforts in mapping the incoming information on semantic memory networks. In patients, the amplitude of the N400 to target words was abnormally increased, independently of the congruence of the preceding context (Nestor et al., 1997; Niznikiewicz et al., 1997; Iakimova et al., 2005) and this increase was proportional to the severity of their positive thought disorder (Andrews et al., 1993).
Even though graded semantic networks may effectively facilitate comprehension and behavior in familiar circumstances, this form of knowledge representation appears too rigid to account for humans' remarkable ability to adaptively plan and interpret actions in less conventional contexts. For example, the concept of a ‘cutting’ implement, such as a knife, may be represented in graded semantic networks together with several of its common properties including <has handle>. This representation would have a limited value for planning a cutting action if only unconventional implements such as a plate or a tape measure (that do not have properties such as <has handle>) are available, say, at an office birthday party. We suggest that the ability to flexibly plan and comprehend actions, especially in non-routine situations, may depend on a distinct type of conceptual knowledge that selectively encodes information of what is required to achieve a specific goal (Sitnikova et al., 2008a; Sitnikova et al., 2008b). Unlike graded semantic networks, such more discrete, ‘rule-like’ requirements can be applied to novel combinations between actions and entities. For instance, a ‘cutting’ action necessitates that the cutting implement <have a sturdy sharp edge> and the object that is being cut be <unsturdy>. These goal-related requirements can be applied to determine that objects such as a plate or a stretched tape measure, which have a relatively sharp sturdy edge, can be used to cut objects such as a cake or lasagna, which are unsturdy.
Electrophysiological studies in nonhuman primates have obtained evidence that prefrontal (but not temporal) neurons display such a discrete pattern of response to categories of visual stimuli that are defined by their functional relevance (Freedman et al., 2001, 2002, 2003; reviewed by Miller et al., 2002; Miller et al., 2003). In healthy humans, neuroimaging suggested that the activity selectively within the dorsolateral prefrontal cortex (DLPFC) is sensitive to expertise in retrieving previously learned functionally-relevant attributes of novel complex shapes (Moore et al., 2006). Furthermore, neuroimaging has specifically linked the DLPFC activity to integration between entities and actions based on goal-related requirements. The left DLPFC was activated when participants decided whether two verbally described objects fit the requirements of a given action (Murray and Ranganath, 2007). A bilateral DLPFC activation was evoked by incompatible object-action combinations described in sentences (e.g., verbs such as ‘lit’ [vs. congruous comparison verbs] in sentences such as “To make the dinner more romantic the table had lit several candles” -- the action conveyed by the verb ‘lit’ cannot be completed given the properties of an entity described by the preceding subject noun ‘table’ -- Kuperberg et al., 2008a, see also Ni et al., 2000; Newman et al., 2001).
ERP studies of language comprehension in healthy participants have identified a positive-going waveform with a peak at approximately 600 ms after stimulus presentation (the P600) that may be sensitive to integration between actions and entities based on goal-related requirements. The P600 was originally described in language comprehension studies in response to syntactic ambiguities and anomalies (Osterhout and Holcomb, 1992; Osterhout et al., 1994), and was interpreted as indexing syntactic reanalysis and/or repair (Osterhout et al., 1994; Friederici, 1995). However, more recent studies showed that it is also modulated by violations of goal-related requirements of linguistically described actions (an increased P600 is evoked to verbs such as ‘lit’ in the ‘the table had lit’ example above, relative to congruous comparison verbs -- Kolk et al., 2003; Kuperberg et al., 2003b; Hoeks et al., 2004; Kim and Osterhout, 2005; van Herten et al., 2005; Kuperberg et al., 2006a; Kuperberg et al., 2006b; van Herten et al., 2006, reviewed by Kuperberg, 2007).
Some data are consistent with the hypothesis that schizophrenia patients have deficits in using goal-related requirements to effectively combine objects and actions. Schizophrenia patients showed a disproportionate decline in accuracy, relative to healthy controls, when judging acceptability of sentences with violations of goal-related requirements (e.g., ‘the table had lit’) as compared to sentences describing predictable or unexpected but possible situations (Kuperberg et al., 2006c). Furthermore, the P600 effect to verbs violating goal-related action requirements (vs. congruous comparison verbs) in sentences was attenuated in patients relative to controls (Kuperberg et al., 2006c). These findings are intriguing, but as they were obtained using language stimuli, a possibility cannot be ruled out that they reflect impaired use of syntactic constraints in schizophrenia (Ruchsow et al., 2003; Kuperberg et al., 2006c). During sentence comprehension, processing of syntactic constraints is an integral part of the integration between the described actions and entities (Kaan et al., 2000). Patients might have inadequately engaged the syntactic processing, and as a result, made errors in sentence interpretation and evoked a reduced P600 effect.
The present study examined how schizophrenia patients utilize graded semantic memory networks and discrete goal-related action requirements during comprehension of real-world activities conveyed in short, silent video clips. Video stimuli have multiple advantages: they can engage patients' attention while preserving naturalistic conceptual processing (Levin and Simons, 2000), they can present events in a short time frame, avoiding confounding influences from impaired context maintenance (Cohen and Servan-Schreiber, 1992; Cohen et al., 1999; Barch et al., 2001; Salisbury et al., 2002; Barch et al., 2003; Holmes et al., 2005), and they can eliminate the effects of certain non-semantic deficits (e.g., in use of syntactic constraints). An ERP technique was used to track fast neurocognitive processes, with millisecond resolution, as they were spontaneously engaged during comprehension. By combining the video and ERP techniques we aimed to assay naturalistic neurocognitive processes, enhancing generalizability of our findings to unstructured and unsupervised goal-directed tasks in real life (see Kiang et al., 2003 for findings that apathy in schizophrenia may be specifically associated with real-life goal-directed behaviors but not with more structured, laboratory goal-directed tasks).
We have designed a video paradigm that modulates processing demands on the two neurocognitive mechanisms of interest (Sitnikova et al., 2003; Sitnikova et al., 2008a; Sitnikova et al., 2008b). In this paradigm, video clips convey common goal-directed activities that end either with a congruous or incongruous final scene. For example, in one scenario, the lead-up context depicts a man setting up an ironing board and placing a pair of pants on the board. In the congruous condition, the final event involves the man ironing the pants with an electric iron (see Figure 1A; also see Figure 1C for another congruous example). In the incongruous condition, a target object in the final scene is both semantically unrelated to the contextual activity and violates goal-related requirements of the expected action: the man moves a dinner fork across the pants (Figure 1B; the dinner fork is semantically unrelated and does not have a flat hot surface necessary for ironing pants; also see Figure 1D for another incongruous example).
Our previous studies in healthy participants found that incongruous (vs. congruous) video endings evoke two types of ERP responses (Sitnikova et al., 2003; Sitnikova et al., 2008a). These responses are similar to the ERPs evoked to violations of context-based expectancies or goal-related action requirements during language comprehension, but are more prolonged, possibly due to extended presentation of the incongruous information as actions unfold over time (Sitnikova et al., 2008b). Incongruous endings in videos produce an increased N400, accompanied by a visual-image specific N300 (see footnote 3), indicating difficulties in mapping these scenes onto semantic memory networks. In addition, the incongruous endings evoke a larger P600 (see footnote 4).
Our previous work provides evidence for sensitivity of the P600 in videos specifically to violations of goal-related action requirements, suggesting that it reflects an effort in evaluating the video scenes against the discrete, rule-like knowledge of what is necessary to achieve the goal of the contextual activity (Sitnikova et al., 2008a). Relative to congruous final video scenes, contextually-inappropriate endings that did not violate the goal-related requirements of their central action (e.g., after placing a cutting board and a loaf of bread on a kitchen counter, a man uses an electric iron to press wrinkles from his pants) produced an enhancement in the N400 rather than the P600 (see footnote 5). Increased dissimilarity of the background information between the context and the final scene also led to an augmentation of the N400 but not the P600. Our earlier findings also suggest a clear distinction of the P600 evoked in videos from the P300 – a waveform of the same polarity that is thought to index domain-general strategic processes (Donchin and Coles, 1988). Lack of the P600 modulation by the behavioral task performed by participants (e.g., passive viewing vs. classifying the videos into congruous/incongruous -- Sitnikova, 2003; Sitnikova et al., 2003), and by task-relevant video anomalies other than violations of goal-related action requirements (Sitnikova et al., 2008a) is inconsistent with the pattern of modulation expected for the P300 (Polich, 1986; Picton, 1992).
Detection of neurocognitive processes evoked by visual events depends on consistent ERP time-locking to these processes across individual trials. To accomplish this, our video paradigm separates the context and final scenes in each clip by a cinematographic cut (Bobker and Marinis, 1973), and presents all critical information (a fully visible target object as it is engaged into the scene's central action) at the onset of each final scene. ERP recordings are time-locked to the scene onset. This ‘cutting’ technique does not appear to disrupt naturalistic comprehension: the pattern of ERPs recorded while healthy participants viewed videos with cuts was generally comparable to ERPs time-locked to the first discernible appearance of the target object in video clips that continuously showed the same real-world activities (cf. Sitnikova et al., 2003 vs. Sitnikova et al., 2008a).
The current study recorded ERPs while patients with schizophrenia and matched healthy control participants viewed video clips ending with congruous final scenes or incongruous scenes with violations of goal-related action requirements. We reasoned that, if patients are able to engage their semantic memory networks during comprehension of goal-directed activities, they would show an enhancement of the N400 to incongruous (vs. congruous) video endings. Any abnormally heightened activity within semantic memory networks in patients was expected to lead to an increased N400 priming effect (reduced N400 to congruous video endings). This hyperactivity, and the associated N400 priming magnitude, were expected to correlate with the severity of patients' disorganization symptoms. In addition, we hypothesized that patients would inadequately recruit the goal-related requirements when integrating actions and objects, and hence, would evoke an abnormally reduced P600 effect to incongruous (vs. congruous) video endings. If this reduced P600 contributes to patients' goal-directed behavior deficits, the severity of this symptomatology was expected to be negatively correlated with the magnitude of their P600 effect. Furthermore, we aimed to examine whether this P600 deficit would be more pronounced when using goal-related action requirements in comprehension is more demanding. Greater impairments were expected for the incongruous actions that were more difficult to make sense of given the properties of the engaged objects (e.g., Figure 1B: the dinner fork does not fulfill goal-related requirements for any conceivable goal-directed action) relative to those that could be comprehended more easily (e.g., Figure 1D: a man gets ready to cut bread, and then slides an electric iron across the loaf of bread: the electric iron fulfills goal-related requirements for defrosting or warming bread up).
Sixteen patients with schizophrenia were recruited from the Lindemann Community Mental Health Center, Boston, Massachusetts and sixteen healthy volunteers were recruited by advertisement. All patients met DSM-IV criteria for schizophrenia. This diagnosis was confirmed by a research psychiatrist using the Structured Clinical Interview for DSM-IV (First et al., 2002b), and was reviewed in each case by a consensus diagnostic conference based on results from a thorough chart examination and review of clinical history with treating clinicians. Healthy volunteers were screened to exclude the presence of psychiatric disorders using the Structured Clinical Interview for DSM-IV, Non-patient Edition (First et al., 2002a). All participants were right-handed (Oldfield, 1971; White and Ashton, 1976), and had normal or corrected-to-normal vision. Participants were excluded if they had a history of neurological damage, head trauma with documented cognitive sequelae, or medical disorders that can impair neurocognitive function, or if they met DSM-IV criteria for substance abuse within the previous 3 months or substance dependence at any time within their life span (assessed as a part of the Structured Clinical Interview for DSM-IV). Written informed consent was obtained from all persons before participation according to the established guidelines of the Massachusetts General Hospital and Tufts Human Subjects Research Committees.
Patients' symptomatology was assessed within one week of ERP testing using the Scales for the Assessment of Positive and Negative Symptoms (SAPS -- Andreasen, 1984b; SANS -- Andreasen, 1984a), and the 18-item Brief Psychiatric Rating Scale (BPRS -- Overall, 1974). All evaluations were completed by a single research associate in clinical neuropsychology who underwent extensive training in the administration of these scales and established 85% inter-rater agreement with the gold-standard ratings (determined by consensus of 2 doctoral level clinicians) on 10 videotaped/live interviews, and thereafter took part in reassessments with a new videotaped interview every 6 months to maintain reliability. Previous studies of these clinical scales have also reported good inter-rater reliabilities for the assessments of the individual symptoms (Ventura et al., 1993; Toomey et al., 1997; Schutzwohl et al., 2003), including SAPS/SANS single item ratings (Peralta and Cuesta, 1995; Toomey et al., 1997; Peralta and Cuesta, 1999). Consistent with previous studies (Barch et al., 2003; Yoon et al., 2008), scores on three global factors reported in these scales (Gur et al., 1991; Phillips et al., 1991; Van der Does et al., 1993; Brekke et al., 1994; Andreasen et al., 1995a; Harvey et al., 1996) were used to delineate patients' clinical profiles: (1) Reality Distortion (Cronbach's α = .77), including grandiosity, suspiciousness, hallucinations, and unusual thought content from the BPRS and hallucinations and delusions from the SAPS; (2) Disorganization (Cronbach's α = .65), including conceptual disorganization, mannerisms and posturing, and disorientation from the BPRS and attention, positive formal thought disorder, and bizarre behavior from the SAPS/SANS; (3) Poverty Symptoms (Cronbach's α = .83), including emotional withdrawal, motor retardation, and blunted affect from the BPRS and anhedonia/asociality, avolition-apathy, alogia, and affective flattening from the SANS.
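The internal-consistency values quoted for the three symptom factors are Cronbach's α coefficients. As an illustration only, the sketch below applies the standard formula, α = k/(k−1) × (1 − Σ item variances / variance of summed scores), to made-up ratings. The function name `cronbach_alpha` and the example scores are hypothetical, and the text does not state how the reported values were computed.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a symptom factor.

    item_scores: list of k lists, one per scale item, each holding
    one rating per patient (same patient order in every list).
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Summed factor score for each patient
    totals = [sum(ratings) for ratings in zip(*item_scores)]
    # Sum of the sample variances of the individual items
    item_variance_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical ratings: 3 items x 4 patients (not data from the study)
example_items = [[1, 2, 3, 4], [2, 3, 4, 5], [1, 3, 3, 5]]
alpha = cronbach_alpha(example_items)
```

Values near 1 indicate that the items of a factor vary together across patients; the factor composition itself is taken from the cited factor-analytic studies.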
In addition, we specifically used the evaluation of the impersistence at work or school from the SANS to quantify patients' deficits in goal-directed behavior in real life as this item arguably is most directly related to poor cognitive abilities necessary in forming adaptive plans of action. Consistent with this notion, abilities to perform real-world tasks have been previously found to load on distinct factors depending on their complexity (e.g., complex behaviors: shopping, housework, and meal preparation; basic routines: toileting, dressing, and grooming -- Thomas et al., 1998). Moreover, in several factor solutions of the SAPS and SANS (Keefe et al., 1992; Minas et al., 1994; Peralta and Cuesta, 1999), the impersistence at work or school loaded specifically onto the factor of social dysfunctions, possibly because both occupational and social activities require cognitive abilities to flexibly build complex behaviors. In contrast, the physical anergia and the impairments in grooming and hygiene, which are included in the SANS avolition-apathy scale but are only moderately correlated with the impersistence at work or school (rs between .37 - .56 -- Keefe et al., 1992; Peralta and Cuesta, 1995), yielded significant loadings on the poverty symptoms and/or disorganized behavior factors. To determine whether any findings involving the impersistence at work or school can be accounted for by more general symptoms of apathy or social dysfunctions, we used the SANS global scores on avolition-apathy and anhedonia-asociality, respectively. We also examined possible mediating effects of general anxiety and depression symptoms by using a BPRS composite score on a previously reported Anxiety/Depression factor, including anxiety, guilt, depression, tension, and somatic concern items (Overall and Klett, 1972; Guy, 1976; Harvey et al., 1996; Ruggeri et al., 2005).
Patients and controls were closely matched on all demographic variables. There were no significant differences between the groups in gender or race distribution, age, education level, parental socioeconomic status, as assessed by Hollingshead Index (Cirino et al., 2002), and IQ, as assessed by the North American Adult Reading Test (Blair and Spreen, 1989). Patients were receiving stable doses of antipsychotic medication for at least four weeks before the ERP study [Clozapine (N=6), Olanzapine (N=4), Risperidone (N=4), Quetiapine (N=1), Chlorpromazine (N=1), Fluphenazine (N=5), Haloperidol (N=1)], with some patients taking more than one typical and/or atypical neuroleptic. Eight of the patients were being treated with atypical antipsychotics only, and two of the patients were being treated with typical antipsychotics only. Healthy volunteers were on no medication. Demographic characteristics of all participants and clinical details for the patient group are given in Table 1.
Eighty pairs of color video clips conveyed typical goal-directed activities (e.g., cooking, shaving, etc.) that ended either with a congruous or incongruous final event (see Figure 1). All video clips included two or more simple real-world events presented as a context and followed by a final scene showing the main actor manipulating a target object. The incongruous video endings were constructed by substituting the appropriate target object (e.g., Figure 1A: an electric iron) with another target object (e.g., Figure 1B: a dinner fork) that was used in the congruous ending in another scenario. The target object in the incongruous scenes was both semantically unrelated to the video context and did not have semantic properties required for the central action constrained by the preceding events (Sitnikova et al., 2008a). In each pair of video clips, the same context was used with either a congruous or incongruous target object. The incongruous clips were assigned to two subsets depending on how difficult it was to understand the goal of the main actor in their final event. This was determined in a pre-test experiment: A group of 15 healthy participants (7 women and 8 men, with mean age of 21, who did not take part in the ERP study) viewed each incongruous clip, and were asked to interpret the goal of the main actor in the final scene (if they could not conceive of any reasonable goal of the target action, they were instructed to report “do not know”). The incongruous clips were then subdivided with a median split on the proportion of responses that provided a goal explanation for the target action in a given scenario (the rates were 68.2 +/- 10.5% for more comprehensible incongruous scenes and 18.6 +/- 16.8% for less comprehensible scenes).
For example, several viewers were willing to offer an explanation for the scene showing an electric iron being moved across a loaf of bread (Figure 1D), but not for the scene showing a dinner fork being moved across a pair of pants on an ironing board (Figure 1B).
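The median split on pre-test responses can be sketched as follows. This is a minimal illustration rather than the procedure actually used: the clip names and proportions are invented, the function `median_split` is hypothetical, and placing clips that fall exactly on the median in the less comprehensible subset is an assumption the text does not address.

```python
from statistics import median


def median_split(goal_rates):
    """Split incongruous clips at the median goal-explanation rate.

    goal_rates: dict mapping clip id -> proportion of pre-test viewers
    who offered a goal explanation for the final scene.
    Returns (more_comprehensible, less_comprehensible) as sorted id lists.
    """
    cutoff = median(goal_rates.values())
    more = sorted(clip for clip, rate in goal_rates.items() if rate > cutoff)
    # Assumption: ties at the median go to the less comprehensible subset
    less = sorted(clip for clip, rate in goal_rates.items() if rate <= cutoff)
    return more, less


# Hypothetical pre-test proportions (not the study's data)
rates = {
    "iron_on_bread": 0.73,
    "fork_on_pants": 0.07,
    "clip_c": 0.60,
    "clip_d": 0.20,
}
more_comprehensible, less_comprehensible = median_split(rates)
```

With these invented values the cutoff is 0.40, so the iron-on-bread clip falls in the more comprehensible subset and the fork-on-pants clip in the less comprehensible one, mirroring the Figure 1D and Figure 1B examples.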
In each clip, context and final shots were separated by a cinematographic cut (details on using the ‘cutting’ technique can be found elsewhere -- Sitnikova et al., 2008a). All target objects (e.g., iron/knife/fork) were clearly visible and were engaged into the central action (e.g., cutting/sliding) at the onset of the final scene, but did not appear in the clip before the final scene. Target scenes were identified using a red frame around the video display.
The clips were arranged into two sets, each consisting of 40 congruous and 40 incongruous items (equal numbers of more and less comprehensible incongruous final scenes). The assignment of clips to sets was such that no context or final scene shot was included twice in one set, although across sets all contexts and all target objects appeared in both the congruous and incongruous conditions. Half of the participants in each diagnosis group viewed videos from set 1 and half viewed videos from set 2.
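One way to picture this counterbalancing is the sketch below. It assumes 80 scenarios, each with a congruous and an incongruous version, and it alternates the assignment within each comprehensibility subset. The function `build_sets` and the scenario labels are hypothetical. The sketch also does not model the additional constraint that target objects are reused across scenarios, so it covers only part of the actual design.

```python
def build_sets(scenarios):
    """Assign each scenario's congruous/incongruous version to two sets.

    scenarios: list of (scenario_id, level) pairs, where level is
    'more' or 'less' (comprehensibility of the incongruous ending).
    Returns (set1, set2): dicts mapping scenario_id -> condition.
    Each context appears once per set, in opposite conditions across sets.
    """
    set1, set2 = {}, {}
    for level in ("more", "less"):
        ids = [sid for sid, lvl in scenarios if lvl == level]
        for index, sid in enumerate(ids):
            if index % 2 == 0:
                set1[sid], set2[sid] = "congruous", "incongruous"
            else:
                set1[sid], set2[sid] = "incongruous", "congruous"
    return set1, set2


# Hypothetical labelling: scenarios 0-39 'more', 40-79 'less' comprehensible
scenarios = [(i, "more" if i < 40 else "less") for i in range(80)]
set1, set2 = build_sets(scenarios)
```

Alternating within each subset yields 40 congruous and 40 incongruous clips per set, with 20 more and 20 less comprehensible incongruous endings in each.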
Video clips, subtending 4° of visual angle and centered on a black background, were shown without sound at a rate of 30 frames per second and ranged from 4 to 29 sec in duration (mean = 11 sec); the final scene lasted 2 sec. Participants progressed through the clips at their own pace, and were instructed to keep their eyes on the center of the screen. After the ‘?’ prompt that appeared 100 ms after the offset of each clip, participants were asked to press a ‘Yes’ or ‘No’ button depending on whether the clip presented a congruous event sequence that would commonly be witnessed in everyday life. Six additional clips were used in a practice session.
The electroencephalogram (bandpass, .01 to 40 Hz, 6 dB cutoffs; sampling rate, 200 Hz) was recorded from 29 scalp electrodes (Electro-Cap International, Inc., see diagram in Figure 2), below and at the outer canthus of an eye, and over the right mastoid; all recordings were referenced to the left mastoid. ERPs (epoch length = 100 ms before to 1170 ms after the final scene onset) were selectively averaged among trials from each condition that were correctly classified as congruous/incongruous and were free of ocular artifacts (activity > 60 μV at eye electrodes): for congruous videos, 82.1% of trials were included in controls, and 77.0% of trials were included in patients; for incongruous videos, 79.2% of trials were included in controls, and 75.3% of trials were included in patients. In addition, separate mean ERPs were formed for the more and less comprehensible subsets of the incongruous video trials as well as for subsets of the corresponding congruous video trials with the same contextual background (e.g., ERPs to the sample incongruous video in Figure 1B and its congruous counterpart in Figure 1A were included in the less comprehensible incongruous and congruous subsets, respectively; whereas the sample videos in Figures 1D and 1C were included in the more comprehensible incongruous and congruous subsets, respectively). We used these separate subsets of congruous videos in the control condition because our previous findings in healthy participants suggest that the extent of background changes between the context and final video scenes may influence the N400 (Sitnikova et al., 2008a). The average ERPs were re-referenced to a mean of the left and right mastoids.
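The selective averaging and re-referencing steps can be sketched as follows. This is a simplified illustration, not the acquisition software's procedure: the trial dictionaries, the function names, and the use of absolute amplitude as the ocular-artifact criterion are assumptions. The re-referencing arithmetic relies only on the fact that, for data recorded against the left mastoid, subtracting half of the right-mastoid signal gives a mean-mastoid reference.

```python
def average_clean_trials(trials, threshold_uv=60.0):
    """Average scalp samples over correct, artifact-free trials.

    trials: list of dicts with 'scalp' and 'eye' (lists of microvolts)
    and 'correct' (bool). A trial is rejected when any eye-channel
    sample exceeds threshold_uv in absolute value (assumed criterion).
    Returns (point-by-point mean, number of trials kept).
    """
    kept = [
        t["scalp"] for t in trials
        if t["correct"] and max(abs(v) for v in t["eye"]) <= threshold_uv
    ]
    if not kept:
        return [], 0
    n = len(kept)
    return [sum(samples) / n for samples in zip(*kept)], n

def rereference_to_mean_mastoids(channel, right_mastoid):
    """Convert left-mastoid-referenced data to a mean-mastoid reference."""
    return [v - 0.5 * m for v, m in zip(channel, right_mastoid)]

# Hypothetical three-sample trials (illustrative values only).
trials = [
    {"scalp": [1.0, 2.0, 3.0], "eye": [5.0, -10.0, 8.0], "correct": True},
    {"scalp": [3.0, 4.0, 5.0], "eye": [2.0, 3.0, -4.0], "correct": True},
    {"scalp": [9.0, 9.0, 9.0], "eye": [75.0, 1.0, 1.0], "correct": True},   # ocular artifact
    {"scalp": [7.0, 7.0, 7.0], "eye": [1.0, 1.0, 1.0], "correct": False},  # misclassified
]
erp, n_kept = average_clean_trials(trials)
print(erp, n_kept)  # [2.0, 3.0, 4.0] 2

reref = rereference_to_mean_mastoids(erp, [2.0, -2.0, 0.0])
print(reref)  # [1.0, 4.0, 4.0]
```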
Group differences in behavioral accuracy were examined using an independent-samples t-test, and an estimate of effect size was obtained by using Cohen's d.
ERPs were quantified by calculating the mean amplitudes (relative to the 100 ms baseline preceding the final scene onset) within time-windows of interest. Two time-windows (0-200 ms, 225-325 ms) were used to examine the early sensory/perceptual and N300 waveforms, respectively. To quantify the N400 and P600 waveforms, we used 325-525 ms and 600-1000 ms time-windows, respectively. These time-windows roughly corresponded to those used in our previous studies with healthy participants (Sitnikova et al., 2003; Sitnikova et al., 2008a). The data were examined using mixed-design analyses of variance (ANOVAs), and estimates of effect size were obtained by using partial eta squared. The Geisser-Greenhouse correction was applied to repeated measures with more than one degree of freedom (Geisser and Greenhouse, 1959), and a significance level of alpha = .05 was used because, in all cases, a priori hypotheses were tested.
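As an illustration of the amplitude measure, the sketch below computes baseline-corrected mean amplitudes for the four time-windows from a single-channel ERP sampled at 200 Hz over the -100 to 1170 ms epoch. The function names, the half-open treatment of window boundaries, and the synthetic waveform are assumptions for illustration only.

```python
SAMPLING_RATE_HZ = 200
EPOCH_START_MS = -100

# Time-windows of interest (ms relative to final scene onset).
WINDOWS = {
    "early": (0, 200),
    "N300": (225, 325),
    "N400": (325, 525),
    "P600": (600, 1000),
}

def sample_index(time_ms):
    """Index of the sample at time_ms (5 ms per sample at 200 Hz)."""
    return (time_ms - EPOCH_START_MS) * SAMPLING_RATE_HZ // 1000

def mean_amplitude(erp, start_ms, end_ms):
    """Mean amplitude in [start_ms, end_ms) minus the pre-onset baseline mean."""
    baseline = erp[sample_index(EPOCH_START_MS):sample_index(0)]
    window = erp[sample_index(start_ms):sample_index(end_ms)]
    return sum(window) / len(window) - sum(baseline) / len(baseline)

# Synthetic ERP: flat at 1.0, with a negativity at 325-525 ms
# and a positivity at 600-1000 ms (not real data).
erp = [1.0] * sample_index(1170)
for i in range(sample_index(325), sample_index(525)):
    erp[i] = -3.0
for i in range(sample_index(600), sample_index(1000)):
    erp[i] = 5.0

amplitudes = {name: mean_amplitude(erp, *w) for name, w in WINDOWS.items()}
print(amplitudes)  # {'early': 0.0, 'N300': 0.0, 'N400': -4.0, 'P600': 4.0}
```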
For each time-window, two omnibus ANOVAs were conducted in order to examine ERP differences between congruous and incongruous videos at midline and lateral sets of scalp regions. Figure 2 shows within-participant scalp topography factors used in these analyses. Each analysis also had a within-participant factor of Congruence (congruous vs. incongruous video endings) and a between-participant factor of Group (controls vs. patients). Significant interactions involving the Congruence and Region factors were parsed by assessing the ERPs at each level of the Region factor. After that, significant Congruence × Group interactions were parsed in two ways: first, by examining the effect of Congruence in each participant group, and second, by examining the effect of Group for each type of videos.
A secondary analysis that evaluated the effects of comprehensibility of the incongruous video endings used two additional omnibus ANOVAs for each time-window. These ANOVAs were identical to the ones described above but included an additional within-participant factor of Comprehensibility. ERP differences between the more and less comprehensible incongruous videos were examined while using their congruous counterparts in the control condition. Thus, the ‘more comprehensible’ level included more comprehensible incongruous videos and their congruous counterparts, and the ‘less comprehensible’ level included less comprehensible incongruous videos and their congruous counterparts. Significant interactions involving the Comprehensibility, Congruence, and Region factors were parsed by assessing the ERPs at each level of the Region factor. In addition, interactions involving the Comprehensibility, Congruence, and Group factors were parsed by assessing the ERPs in each participant group, and interactions involving the Comprehensibility and Congruence factors were parsed by assessing the ERPs at each level of the Comprehensibility factor. After that, significant Congruence × Group interactions were parsed in two ways: first, by examining the effect of Congruence in each participant group, and second, by examining the effect of Group for each type of video.
Spearman two-tailed correlations were used to examine relationships of clinical symptoms (shown in Table 1) to the ERP findings. The N300 and N400 amplitudes were averaged across three electrode sites within the Frontal Midline region, and the P600 amplitude was averaged within the Parietal Midline region -- in our earlier studies, these scalp areas showed maximal differences (Sitnikova et al., 2003; Sitnikova et al., 2008a). For a priori hypotheses about the relationships between the disorganization symptomatology and the N400 effect and between goal-directed behavior deficits and the P600 effect, alpha = .05 was used. For other, more exploratory correlations, computed to establish specificity of any findings, the significance level was determined based on the Bonferroni correction. Spearman correlations of the behavioral accuracy in classifying the videos into congruous/incongruous with the ERP effects and clinical measures were also examined.
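For readers unfamiliar with these statistics, a minimal sketch of a Spearman correlation (Pearson correlation of average ranks, which handles ties) and a Bonferroni-adjusted threshold is given below. The symptom scores, ERP effect values, and the number of exploratory tests are hypothetical; p-values are not computed here.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def spearman_rho(x, y):
    """Spearman rank correlation, computed as Pearson on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def bonferroni_alpha(alpha, n_tests):
    """Per-test significance threshold under the Bonferroni correction."""
    return alpha / n_tests

# Hypothetical symptom scores and ERP effects (microvolts) for five patients.
symptom = [1, 2, 2, 4, 5]
effect = [-1.0, -2.5, -2.0, -4.0, -6.0]
rho = spearman_rho(symptom, effect)
print(round(rho, 3))  # -0.975

# Hypothetical family of five exploratory correlations.
threshold = bonferroni_alpha(0.05, 5)
```

A negative rho in this toy example means that higher symptom scores accompany more negative ERP effects, the same direction as a larger negativity effect with greater symptom severity.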
Both participant groups were highly accurate in classifying the videos into congruous/incongruous. The accuracy rate was 94.2 +/- 5.4% in controls and 90.5 +/- 7.1% in patients (t(30) = -1.680, p > .1, Cohen's d = .587).
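The reported effect size can be checked from the summary statistics alone. The sketch below assumes the pooled-standard-deviation form of Cohen's d and 16 participants per group; the t value it derives is approximate because it starts from rounded means and standard deviations, and its sign depends on the order in which the groups are entered.

```python
from math import sqrt

def cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Cohen's d using the pooled standard deviation of two groups."""
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / sqrt(pooled_var)

def t_from_d(d, n_a, n_b):
    """Independent-samples t statistic implied by d and the group sizes."""
    return d * sqrt(n_a * n_b / (n_a + n_b))

# Accuracy (%) in controls vs. patients, as reported in the text.
d = cohens_d(94.2, 5.4, 16, 90.5, 7.1, 16)
t = t_from_d(d, 16, 16)
print(round(d, 3))  # 0.587
print(round(t, 2))  # 1.66
```

The recomputed d matches the reported value of .587, and the implied t of about 1.66 is close in magnitude to the reported 1.680, with the small gap plausibly due to rounding of the summary statistics.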
The grand average ERPs time-locked to the presentation of congruous and incongruous video endings are plotted in Figure 3A in the control group, and in Figure 3B in the patient group. In the early sensory/perceptual time-window (0-200 ms), no significant differences were observed (ps > .1). The results of omnibus ANOVAs and planned comparisons involving Congruence and Group factors within each scalp region are presented in Table 2, and estimates of effect size are presented in Table 6.
ERP patterns were similar across the N300 (225-325 ms) and N400 (325-525 ms) epochs. In both participant groups, ERPs were more negative to incongruous (vs. congruous) video endings (main effect of Congruence in Midline and Lateral Omnibus ANOVAs). These negativity effects were primarily evident over anterior and central scalp regions (Congruence × Region interaction in Midline and Lateral Omnibus ANOVAs), and reached significance in several regions (Midline and Lateral planned comparisons by Region). None of these Congruence effects interacted with the Group factor, and there were no main Group effects or Group × Region interactions (ps > .1).
In the P600 epoch (600-1000 ms), the ERPs became strikingly different between the participant groups (Congruence × Group and Congruence × Region × Group interactions in Midline and Lateral Omnibus ANOVAs). The negativity effect continued only at the Midline Anterior-frontal Region in both participant groups (planned comparisons by Region). Over more posterior scalp regions, incongruous (vs. congruous) video endings evoked a positive-going effect that was dramatically larger in controls than patients (Congruence × Group interaction in Midline and Lateral planned comparisons by Region). These between-group differences were further parsed in planned comparisons presented in Table 3. The positivity effect reached significance in several posterior regions in controls (effect of Congruence in Midline and Lateral planned comparisons by Group) but was not significant in patients (ps > .1). This Congruence effect was attenuated in patients due to less positive ERPs to incongruous video endings (effect of Group in Midline and Lateral planned comparisons by Congruence); ERPs were not different between the participant groups in the congruous condition (ps > .1).
The grand average ERPs time-locked to the presentation of the more and less comprehensible incongruous video endings (plotted together with the ERPs for their respective congruous counterparts) are shown in Figures 4A&B separately for the control and patient groups. In the early sensory/perceptual time-window (0-200 ms), no significant differences were observed (ps > .1). The results of omnibus ANOVAs and planned comparisons for the N300 and N400 time-windows are presented in Table 4, and estimates of effect size are presented in Table 6.
During the N300 epoch (225-325 ms), the anterior negativity effect in control participants was evident to the more comprehensible but not the less comprehensible incongruous final scenes (vs. their congruous counterparts), whereas the difference between these negativity effects was in the reverse direction in patients (Comprehensibility × Congruence × Region × Group interaction in Midline and Lateral Omnibus ANOVAs). These differences reached significance only in the Midline Anterior-frontal Region (Comprehensibility × Congruence × Group interaction in planned comparisons by Region), where the comprehensibility of the incongruous scenes influenced the negativity effect in the patient group (Comprehensibility × Congruence interaction in planned comparisons by Group). In this region, the negativity effect to the less comprehensible incongruous (vs. congruous) final scenes was larger in patients than in controls (Congruence × Group interaction in planned comparisons by Comprehensibility).
In the N400 epoch (325-525 ms), the pattern of comprehensibility effects continued to be markedly different between the participant groups (Comprehensibility × Congruence × Region × Group interaction in Midline and Lateral Omnibus ANOVAs); these differences reached significance at several anterior scalp regions (Comprehensibility × Congruence × Group interaction in Midline and Lateral planned comparisons by Region). At these anterior regions, controls evoked a negativity effect only to the more comprehensible incongruous (vs. congruous) video endings (Comprehensibility × Congruence interaction in Midline and Lateral planned comparisons by Group). In contrast, patients evoked comparable negativity effects both to the more and less comprehensible incongruous final scenes (vs. their congruous counterparts; effect of Congruence in Midline and Lateral planned comparisons by Group). The negativity effect was comparable between the participant groups in the more comprehensible condition (effect of Congruence in Midline and Lateral planned comparisons by Comprehensibility). However, significant between-group differences were present in the less comprehensible condition (Congruence × Group interaction in Midline and Lateral planned comparisons by Comprehensibility); ERPs evoked by the less comprehensible incongruous endings were more negative in patients than in controls (effect of Group in Midline planned comparisons by Congruence), whereas ERPs evoked by their congruous counterparts were not different between the participant groups (ps > .1).
The results of omnibus ANOVAs and planned comparisons for the P600 time-window are presented in Table 5, and estimates of effect size are presented in Table 6. In the P600 epoch (600-1000 ms), the negativity effect to the more comprehensible incongruous (vs. congruous) final scenes was larger than the negativity effect to the less comprehensible incongruous (vs. congruous) final scenes, whereas modulation of the positivity effect was in the reverse direction (Comprehensibility × Congruence interaction in Midline and Lateral Omnibus ANOVAs). The negativity effect to the more comprehensible incongruous (vs. congruous) scenes was comparable between the participant groups (Congruence × Region interaction in Midline and Lateral planned comparisons by Comprehensibility), reaching significance at several anterior scalp regions (effect of Congruence in Midline planned comparisons by Region). However, the positivity effect to the less comprehensible incongruous (vs. congruous) scenes was primarily evident in controls rather than patients (Congruence × Group interaction in Midline and Lateral planned comparisons by Comprehensibility), and reached significance only in controls (effect of Congruence in Midline and Lateral planned comparisons by Group). This positivity effect in controls was larger over centroparietal scalp regions (Congruence × Region interaction in Midline and Lateral planned comparisons by Group), but was wide-spread and reached significance at several scalp regions (effect of Congruence in Midline planned comparisons by Region). This positivity effect was attenuated in patients due to less positive ERPs to the less comprehensible incongruous video endings (effect of Group in Midline and Lateral planned comparisons by Congruence); ERPs were not different between the participant groups in the congruous condition (ps > .1).
In patients, the Disorganization factor score correlated negatively both with the N300 (r = -.512, p < .05) and N400 effects (r = -.645, p < .01) to incongruous (vs. congruous) video endings: the higher the Disorganization score, the larger, more negative, the N300 and N400 effects (Figure 5A illustrates this relationship for the N400 effect). These results appeared to stem primarily from increased priming (reduced, less negative N300/N400) in the congruous condition: the Disorganization score correlated with the N300 and N400 amplitudes evoked to congruous (r = .765, p < .01; r = .578, p < .02, respectively) but not to incongruous video clips (r = .289, p > .1; r = .133, p > .1, respectively). The Disorganization factor did not significantly correlate with the P600 effect (r = -.457, p > .07; note that this trend toward a correlation apparently stemmed from the ERP response in the congruous condition, where there was a correlation trend with the Disorganization score – r = .470, p = .07; this pattern is distinct from the analyses between the P600 effect and impersistence at school or work, see below).
Patients' score on impersistence at school or work was correlated negatively with the P600 effect to incongruous (vs. congruous) video endings (r = -.508, p < .05): the higher the impersistence score, the smaller, less positive, the P600 effect (Figure 5B). This correlation appeared to stem primarily from the P600 response in the incongruous condition (r = -.457, p = .07) rather than the P600 response in the congruous condition (r = .078, p = .77). A similar pattern of results was also observed in the secondary analysis with the P600 effect to the less comprehensible incongruous (vs. congruous) final scenes: impersistence at school or work correlated negatively with the P600 effect (r = -.604, p < .02; Figure 5C), and with the P600 response in the incongruous (r = -.510, p < .05) but not the congruous condition (r = .146, p > .1). Impersistence at school or work was not correlated with the N300 and N400 effects (r = -.380, p > .1; r = -.104, p > .1, respectively).
The above N400 and P600 correlations were highly selective. Correlations were not significant between these ERP effects and scores on the Reality Distortion (r = .264, p > .1; r = .171, p > .1, respectively) and Poverty Symptoms factors (r = .286, p > .1; r = .210, p > .1, respectively). Correlations were also not significant with the SANS global ratings of avolition-apathy (r = .165, p > .1; r = .224, p > .1, respectively) and anhedonia-asociality (r = .319, p > .1; r = -.023, p > .1, respectively), and the BPRS composite score on the Anxiety/Depression factor (r = .342, p > .1; r = .216, p > .1, respectively).
Finally, in both participant groups, accuracy in classifying the videos into congruous/incongruous was not correlated with the N400 and P600 ERP effects (controls: r = .121, p > .1; r = -.047, p > .1, respectively; patients: r = -.144, p > .1; r = .208, p > .1, respectively). This behavioral accuracy also did not correlate with patients' scores on the Disorganization factor and impersistence at school or work (r = .119, p > .1; r = -.336, p > .1, respectively).
The current study presents evidence for specific neurocognitive abnormalities that may underlie dysfunctional conceptual processing in schizophrenia during comprehension of complex, naturalistic goal-directed behaviors presented in video clips. The video clips ended either with a congruous or incongruous final scene. When comprehending the incongruous scenes, increased demands on the neurocognitive process mapping the input on graded semantic memory networks were assumed to modulate the N400, and increased demands on the neurocognitive process integrating perceived actions and entities based on goal-related requirements were assumed to modulate the P600. As similar neurocognitive mechanisms mediating conceptual processes/representations may be utilized both in comprehension and execution of real-world behaviors (Humphreys and Forde, 1998; Rizzolatti et al., 2001; Ruby et al., 2002), we reasoned that abnormalities revealed during comprehension may help to understand patients' clinical behavioral symptoms. The results revealed that in schizophrenia patients, increased N400 priming was correlated with the severity of disorganization symptoms, assessed using a composite score on a subset of the SAPS, SANS, and BPRS ratings: the more severe the disorganization, the smaller (less negative) the N400 to congruous scenes, the larger the N400 effect to incongruous (vs. congruous) scenes. In addition, the P600 to incongruous video endings and the P600 effect to incongruous (vs. congruous) endings were strongly attenuated in patients relative to healthy controls. In patients, the P600 effect was inversely correlated with the severity of goal-directed behavior deficits, assessed using the SANS score of impersistence at school or work: the more impaired the goal-directed behavior, the smaller (less positive) the P600 effect. 
Finally, a secondary analysis revealed that neurocognitive abnormalities in schizophrenia were especially pronounced during viewing incongruous scenes that were harder to make sense of based on the goal-related action requirements. The less comprehensible incongruous (vs. congruous) scenes evoked a P600 effect in controls, but an N400 effect in patients.
The N400 effect to incongruous (vs. congruous) video endings was evident in schizophrenia patients. Taken together with similar earlier reports in language comprehension (e.g., Grillon et al., 1991; Niznikiewicz et al., 1997; Sitnikova et al., 2002; Iakimova et al., 2005; Kuperberg et al., 2006c), this suggests that, in general, patients are able to use their real-world knowledge, accessed from graded semantic memory networks, in comprehension. Interestingly, we also found no evidence for any delay of this processing in schizophrenia: the negative-going ERP effect to the incongruous (vs. congruous) video endings was not different between controls and patients even in the earlier N300 time-window (see footnote 6). However, our study sample was relatively small (16 patients and 16 controls), and this might have prevented us from detecting certain more subtle between-group differences.
Within the patient group, the relationship between the severity of disorganization symptoms and increased N400 priming during real-world comprehension extends previous findings linking positive thought disorder with abnormally enhanced reaction time and N400 modulation to verbal targets in priming paradigms (e.g., Kwapil et al., 1990; Spitzer et al., 1993b; Mathalon et al., 2002; Moritz et al., 2003; Kreher et al., 2007). Consistent with many of these previous hyperpriming reports, the N400 attenuation to the congruous video endings did not significantly differ between the overall schizophrenia group and healthy controls, but rather was linked specifically to disorganization symptomatology. Just as in priming paradigms with short context-target asynchronies, videos that deliver rapid and naturalistic sequences of images may automatically engage processing within the semantic memory networks. Taken together, these findings suggest that in disorganized patients, increased automatic activity within semantic memory networks may abnormally influence function both in the verbal and non-verbal domains. In everyday behavior, this hyperactivity might lead to intrusions into the performed activities of goal-inappropriate but semantically related actions or entities (Andreasen, 1984b).
It is unlikely that the reduced P600 modulation in schizophrenia patients was related to a general lack of attention to the content of the videos, because accuracy in judging congruence of each scenario was comparable between patients and controls. The lack of attention account is also unlikely given that the N400 modulation to incongruous (vs. congruous) target stimuli, known to be influenced by depth of conceptual processing (Bentin et al., 1993; Chwilla et al., 1995), was not reduced in patients relative to controls (both in our main and secondary analyses).
The abnormally reduced P600 effect to incongruous (vs. congruous) video endings in patients was due to a smaller P600 to the incongruous video endings, suggesting under-recruitment of goal-related requirements for integration between actions and entities during real-world comprehension. Moreover, the abnormally reduced P600 effect but increased N400 effect to the less comprehensible incongruous (vs. congruous) scenes in the schizophrenia group indicate that, unlike healthy controls who attempted adaptive integration of these difficult to comprehend videos based on goal-related requirements, patients might have inappropriately relied on the more rigid mapping of these scenes on the graded semantic networks.
The association between the severity of patients' impersistence at work or school and the reduced P600 modulation supports our hypothesis that goal-directed behavior deficits in schizophrenia may arise from impaired use of goal-related requirements to flexibly and effectively combine objects and actions. Nonetheless, patients' scores on the SANS global avolition-apathy scale were not correlated with the P600 effect in our video paradigm, possibly because using goal-related requirements may be more important for complex real-life activities, such as school or work tasks, than for basic personal hygiene routines that also are assessed by the SANS global avolition-apathy scale. These findings are interesting, but they should be treated with caution due to limited reliability of the SANS individual ratings. In future studies, it will be important to re-examine this result by using more comprehensive assessment methods of avolition-apathy symptoms (e.g., the Apathy Evaluation Scale -- AES -- Kiang et al., 2003) and patients' functional capacity on real-world tasks (e.g., the UCSD Performance-based Skills Assessment – UPSA -- Patterson et al., 2001). It will also be essential to directly test the relationship between the video P600 and abilities of schizophrenia patients to design unconventional actions with real objects, such as to cut a cake with a tape measure. Previously, standardized tests of instrumental behaviors simulated in the clinic have proven effective in elucidating how neurocognition contributes to the capacity of patients to function in the real-world community settings, in the absence of such intervening variables as social cognition or employment rates (Green, 2007).
The P600 effect in healthy participants evoked in the present study is in line with previous reports of a P600 to linguistically described incompatible action-entity combinations (e.g., Kolk et al., 2003; Kuperberg et al., 2003b; Kim and Osterhout, 2005; van Herten et al., 2006; Kuperberg et al., 2007b). Interestingly, combinations of subject nouns and verbs which were easier to integrate based on goal-related action requirements (e.g., ‘the paragraph would write’ – the paragraph does not fulfill the requirements for the entity that can write, but it fulfills the requirements for the entity that can be written) evoked a smaller P600 effect than the combinations which were harder to integrate (e.g., ‘the desk would write’ – the desk does not fulfill the requirements for either the entity that can write or the entity that can be written -- Kuperberg et al., 2006a; Sitnikova et al., 2008b). Similarly, in the present video study, healthy controls evoked a smaller P600 effect when it was easier to interpret the goal of the conveyed action given the properties of the used target object (e.g., moving an electric iron across a loaf of bread – the iron fits the requirements for the entity that can warm up or defrost the bread), than when the goals were less comprehensible (e.g., moving a dinner fork across a pair of pants on an ironing board; see footnote 7). These results provide evidence that both in the verbal and non-verbal domains, a neurocognitive process integrating entities and actions based on goal-related requirements may be reflected by the P600. It is to be determined in future studies whether this processing is distinct from that reflected by the P600 previously reported in sentences with syntactic ambiguities (e.g., Osterhout et al., 1994) or errors (e.g., Osterhout et al., 1997).
Some researchers have suggested that this syntactic P600 may reflect recovery of the sentence meaning based on both syntactic and semantic information (Friederici and Frisch, 2000; Kaan et al., 2000). For example, the syntactic processing may be a pre-requisite to determining the relationships between the described actions and entities, and difficulties in this integration, arising either from syntactic or object-action semantic incompatibility errors, may be reflected by the P600. In schizophrenia, the P600 effect has been found to be abnormally attenuated both to linguistically described incompatible entity-action combinations (Kuperberg et al., 2006c) and syntactic errors (Ruchsow et al., 2003; Kuperberg et al., 2006c). The present finding of the reduced P600 effect to video violations of goal-related requirements argues against the possibility that in schizophrenia the linguistic P600 is attenuated merely due to deficient syntactic processing.
It is interesting that the relationships between the disorganization and the N400 and between the negative symptom of goal-directed behavior deficits and the P600 were highly selective: the goal-directed behavior deficits were not correlated with the N400 priming, and the disorganization was not correlated with the P600 enhancement. This finding is consistent with previous studies that have documented a segregation between disorganization and negative behavioral symptoms in schizophrenia (Liddle, 1987; Keefe et al., 1992; Andreasen et al., 1995a; Andreasen et al., 1995b; Peralta and Cuesta, 1999; John et al., 2003). Thus, susceptibility mechanisms leading to these symptom types may be independent, and may be related to abnormalities within distinct conceptual processing streams.
Could it be argued that the ERP positivity effect in the present video paradigm is similar in nature to another positive-going waveform, the P300, which is generally evoked by ‘oddball’ stimuli (Donchin and Coles, 1988) and is commonly attenuated in schizophrenia (Ford et al., 1999)? We believe that this account is unlikely for several reasons. First, in healthy individuals, we have previously demonstrated that the P600 effect in videos is selectively evoked to violations of goal-related action requirements, rather than any type of task-relevant semantic anomaly (Sitnikova, 2003; Sitnikova et al., 2003; Sitnikova et al., 2008a; Sitnikova et al., 2008b). Second, in healthy individuals, we have shown that, unlike the P300 (Polich, 1986; Picton, 1992), the P600 effect evoked to video violations of goal-related action requirements is not modulated by the behavioral task performed by participants (e.g., passive vs. active -- Sitnikova, 2003; Sitnikova et al., 2003). Third, unlike the P300 that has previously been shown to non-specifically correlate with both negative and positive schizophrenia symptoms (Turetsky et al., 1998; Frodl-Bauch et al., 1999; Mathalon et al., 2000), the P600 effect in the present study did not correlate with any other symptoms besides the SANS impersistence at work or school.
Negative symptoms, such as goal-directed behavior deficits, are related to poor treatment success, posing a considerable burden to affected individuals and society at large (Velligan et al., 1997; Poole et al., 1999; Sharma and Antonova, 2003). The present results, linking these deficits to a specific abnormality in utilizing goal-related requirements of real-world actions, give some insights into the neurocognitive mechanism that may contribute to this poor outcome. Recent research in computational neuroscience suggests that the neurobiological mechanisms specific to the prefrontal cortex (that support updating of maintenance contingent on the presence of a reward) may mediate self-organization of discrete, rule-like representations coded by patterns of activity (distinct sets of units with high synaptic weights -- Rougier et al., 2005, rather than by changes in synaptic weights, which have been used to simulate graded connections within semantic memory networks -- Hutchison, 2003). These prefrontal mechanisms might also support self-organization of patterns of activity coding minimal goal-related requirements of real-world actions. Specifically, through breadth of learning experience with actions that achieved or failed to achieve their goal (i.e., either coupled or not with a “reward” signal), these prefrontal mechanisms may identify the patterns of activity that have been present across all instances of achieving specific goals (Sitnikova et al., 2008a; Sitnikova et al., 2008b). Abnormalities of the prefrontal function that are a cardinal feature of schizophrenia (Braver et al., 1999; Weinberger et al., 2001; Manoach, 2003; Barch, 2005; Sitnikova and Kuperberg, 2007) may prevent the very formation of memory representations coding goal-related action requirements. 
As a result, improving the prefrontal function in treatment may not be sufficient but might need to be complemented by cognitive remediation covering a broad range of real-world activities so as to form the conceptual memory representations required for successful goal-directed behavior.
It is noteworthy that our schizophrenia sample was medicated. Unfortunately, unless dose regimen for administering medications is experimentally determined, statistical methods are largely ineffective at evaluating contributions of this potentially confounding variable (Blanchard and Neale, 1992). Therefore, in future studies, it will be critical to confirm our findings in drug-free patients with schizophrenia.
To our knowledge, the current study is the first to provide evidence that a form of conceptual processing, distinct from activation and building up expectations within graded semantic memory networks, is impaired in schizophrenia. When comprehending real-world goal-directed activities, patients exhibited an under-recruitment of goal-related action requirements, as was indexed by the P600 ERP waveform, which tracked with deficits in goal-directed behavior in their lives. Our findings also suggest that during naturalistic real-world comprehension, schizophrenia patients were able to engage semantic memory networks, as was indexed by the N400 (and earlier N300) ERP waveforms. In fact, in patients with disorganization symptomatology, this process might even be hyperactive.
We thank David R. Hughes, Sonya Jairaj, Karin Blais, and Jordana Cotton for their assistance in preparing the materials and collecting the data. This research was supported by NARSAD (with the Sidney R. Baer, Jr. Foundation) grants to TS and GK, MGH Fund for Medical Discovery grant to TS, grants MH02034, MH071635, and MGH Claflin Distinguished Scholars Award to GK, and by the Institute for Mental Illness and Neuroscience Discovery (MIND).
1. The question of how knowledge is represented in the human brain has been a matter of long-standing debate in cognitive psychology and neuroscience. According to a currently prevailing view, conceptual knowledge may be organized and neuroanatomically segregated according to the modality of information: visual, acoustic, motor, olfactory, abstract, etc. (Chang, 1996; Thompson-Schill, 2003; Sitnikova et al., 2006). A wealth of behavioral, electrophysiological, and neuroimaging data suggests that words and visual images access similar (even though possibly non-identical) representations within this distributed conceptual knowledge system (e.g., Potter and Faulconer, 1975; Paivio, 1986; Ganis et al., 1996; Vandenberghe et al., 1996).
2. The N400s evoked by words and pictures show similar sensitivity to associative and categorical relationships. Therefore, it is believed that they reflect analogous underlying neurocognitive mechanisms (Ganis et al., 1996; McPherson and Holcomb, 1999; Federmeier and Kutas, 2001).
3. In studies of visual image processing, the N400 is often preceded by a somewhat earlier but functionally similar ERP component, the N300, thought to reflect accessing image-specific graded representational networks (McPherson and Holcomb, 1999; Sitnikova et al., 2006). This N300/N400 complex is known to have a more anterior scalp topography than the N400 to words, probably due to non-identical representations activated by these two types of stimuli (McPherson and Holcomb, 1999; West and Holcomb, 2002).
4. In our previous work using video clips in healthy participants, we labeled this component evoked by violations of goal-related action requirements as a ‘Late Positivity’ rather than ‘P600’. However, in the literature on schizophrenia, the term ‘Late Positive Component’ is frequently used to refer to a positivity that has been interpreted as part of the P300 family. Therefore, to avoid confusion with this use of the term ‘Late Positive Component’, in this paper we label the positivity evoked by violations of goal-related action requirements in video clips as a ‘P600’.
5. The P600s evoked by words and visual images show similar sensitivity to the mismatch between the goal-related action requirements and the properties of the engaged entities, and similar insensitivity to associative relationships. Therefore, it is likely that they reflect analogous underlying neurocognitive mechanisms (Sitnikova et al., 2008a; Sitnikova et al., 2008b).
6. In fact, schizophrenia patients evoked an abnormally increased N300 effect to the less comprehensible incongruous (vs. congruous) video endings. Federmeier and Kutas (2001) suggested that the N300 may be modulated by visual-feature overlap between eliciting and expected pictures of objects. The incongruous target objects in our videos might have lacked visual-feature overlap with the expected target objects, especially in the less comprehensible condition. Patients might have inappropriately attempted to map these target objects onto image-specific graded representational networks.
7. In healthy controls, the more comprehensible incongruous (vs. congruous) video endings evoked a larger N400 effect than the less comprehensible incongruous endings. This pattern of ERPs evoked by controls in our video paradigm is comparable to that previously evoked in a similar language comprehension paradigm in healthy participants (Kuperberg et al., 2007b). Verbs that were semantically unrelated to the preceding sentence context evoked a larger N400 and a smaller P600 when they described possible actions (e.g., ‘To make the dinner more romantic the hostess had shaved …’), compared to when they described impossible actions (e.g., ‘To make the dinner more romantic the table had shaved …’). It is likely that during comprehension, processes of mapping the target stimuli onto the graded semantic memory networks (reflected by the N400) and evaluating the target stimuli against the goal-related action requirements (reflected by the P600) are engaged in parallel. The P600 effect may not be visible in the ERPs recorded from the scalp during the N400 time-window due to cancellation by the overlapping N400 effect of the opposite polarity (but see Bornkessel et al., 2002; Bornkessel et al., 2003, for evidence that the action-entity integration based on semantic requirements may evoke ERP positivities with an onset by 200 msec after the stimulus presentation). Therefore, it is possible that considerable difficulties in using the goal-related action requirements to integrate stimuli such as our less comprehensible videos might lead to discontinuation of the mapping onto the semantic memory networks, resulting in smaller N400 effects.
Publisher's Disclaimer: The following manuscript is the final accepted manuscript. It has not been subjected to the final copyediting, fact-checking, and proofreading required for formal publication. It is not the definitive, publisher-authenticated version. The American Psychological Association and its Council of Editors disclaim any responsibility or liabilities for errors or omissions of this manuscript version, any version derived from this manuscript by NIH, or other third parties. The published version is available at www.apa.org/journals/abn.