We overview the discovery, characterization, and evolving use of the N400, an event-related brain potential response linked to meaning processing. We describe the elicitation of N400s by an impressive range of stimulus types -- including written, spoken, and signed (pseudo)words, drawings, photos, and videos of faces, objects and actions, sounds, and mathematical symbols -- and outline the sensitivity of N400 amplitude (as its latency is remarkably constant) to linguistic and nonlinguistic manipulations. We emphasize the effectiveness of the N400 as a dependent variable for examining almost every aspect of language processing, and highlight its expanding use to probe semantic memory and to determine how the neurocognitive system dynamically and flexibly uses bottom-up and top-down information to make sense of the world. We conclude with different theories of the N400’s functional significance and offer an N400-inspired re-conceptualization of how meaning processing might unfold.
The first report of an N400 response was published 30 years ago, in 1980, by Kutas and Hillyard. Since its discovery, more than 1000 articles have been written using the N400 as a dependent measure, across a wide range of areas, including language processing, object, face, action, and gesture processing, mathematical cognition, semantic and recognition memory, and a variety of developmental and acquired disorders. Across this body of literature, much has been learned about the measure and, in tandem, about human cognitive and neural functioning. Our goal in this piece is to recount the N400’s history, summarizing what we have learned about and from this electrophysiological measure and taking the opportunity to reflect on how discoveries are made, how neurocognitive measures are characterized, and how subfields of scientific inquiry are born and mature.
Soon after the discovery, in the mid-1950’s, that it was possible via averaging to extract a time series of changes in electrical brain activity recorded at the human scalp before, during, and after an event of interest, it was demonstrated that measurable parameters of these evoked potentials – their amplitudes, latencies, and scalp topographies – systematically varied with stimulus or response features (e.g., pitch, color, intensity). Within five years, the field of cognitive electrophysiology was born from various demonstrations that scalp ERP waveforms indexed not only objective stimulus characteristics (often within the first 200 ms), but also endogenous influences related to people’s reactions or attitudes to the stimuli and experimenters’ instructions (between ~200–1500 ms post-stimulus onset). By 1978, cognitive electrophysiologists had identified ERP markers of stimulus evaluation processes distinct from response preparation and execution. In particular, the P300 (P3b) is an endogenous, mostly modality-independent response observed over central-parietal scalp locations whose latency (300–800 ms) varies systematically with the duration of stimulus categorization. P3b amplitudes are inversely correlated with the eliciting item’s subjective probability of occurrence: the less probable an event, the larger the P3b elicited (reviewed in Hillyard & Kutas 1983).
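The averaging logic described above can be illustrated with a minimal sketch. The function name `average_erp` and the toy data are hypothetical, and the "background activity" is reduced to trial-specific offsets chosen to cancel exactly, which real EEG noise does only approximately and only over many trials.

```python
from statistics import fmean

def average_erp(epochs):
    """Average time-locked single-trial epochs point by point.

    epochs: list of trials, each a list of voltages (microvolts) sampled
    at the same time points relative to the event of interest.
    """
    if not epochs:
        raise ValueError("need at least one epoch")
    n_samples = len(epochs[0])
    if any(len(trial) != n_samples for trial in epochs):
        raise ValueError("all epochs must have the same number of samples")
    # zip(*epochs) groups the i-th sample of every trial together.
    return [fmean(samples) for samples in zip(*epochs)]

# Toy data (hypothetical): a fixed event-related shape plus trial-specific
# offsets standing in for activity that is not time-locked to the event.
evoked = [0.0, 1.0, -2.0, 0.5]
offsets = [3.0, -3.0, 1.0, -1.0]  # sums to zero, so it cancels in the mean
epochs = [[v + off for v in evoked] for off in offsets]

erp = average_erp(epochs)
print(erp)
```

Activity that is consistently time-locked to the event survives the average, whereas activity unrelated to the event tends toward zero as trials accumulate.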
Against this backdrop, to investigate the role of sentence context on word recognition during reading, Kutas and Hillyard (1980) modified the oddball paradigm, known to elicit large P3b’s, for language materials. Undergraduates read 7-word sentences presented one word per second; 75% were congruent control sentences (e.g., I shaved off my mustache and beard.), while a random 25% ended “oddly” with an improbable word (Expt. 1: He planted string beans in his car.) or a wholly anomalous one (Expt. 2: I take my coffee with cream and dog.). Surprisingly, although the manipulation modulated the ERP, it did not yield a P3b, but rather a large negativity with a broad (parietally maximal) scalp distribution, peaking around 400 ms (largest for semantic anomalies, but also present for improbable but sensible endings); it was called the N400.
The N400 was labeled as such because it was a relative negativity peaking around 400 ms. More precisely, it is negative-going at particular scalp locations relative to a specific reference derivation (e.g., posterior sites relative to recordings behind the ears), relative to a 100 ms pre-stimulus baseline. Indeed, the N400 to an unexpected item need not be negative in absolute terms. It is thus typically examined in cross-condition comparisons and routinely instantiated in a difference ERP created via a point-by-point subtraction of, e.g., a congruent ERP from an incongruent one. This difference – or N400 effect -- is a monophasic negativity between 200 and 600 ms, largest over centro-parietal sites, with a slightly right hemisphere bias (at least for written words in sentences).
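As a rough illustration of how such a difference ERP might be computed, the sketch below baseline-corrects two condition averages against the 100 ms pre-stimulus interval, subtracts them point by point, and takes the mean amplitude in the 200–600 ms window. All function names and voltages are hypothetical, and the coarse 100 ms sampling is chosen only to keep the example short.

```python
from statistics import fmean

def baseline_correct(erp, times_ms, baseline=(-100, 0)):
    """Subtract the mean voltage in the pre-stimulus baseline window."""
    start, end = baseline
    base = fmean(v for v, t in zip(erp, times_ms) if start <= t <= end)
    return [v - base for v in erp]

def difference_wave(incongruent, congruent):
    """Point-by-point subtraction: incongruent minus congruent."""
    return [a - b for a, b in zip(incongruent, congruent)]

def mean_amplitude(wave, times_ms, window=(200, 600)):
    """Mean voltage within a latency window (inclusive bounds, in ms)."""
    start, end = window
    return fmean(v for v, t in zip(wave, times_ms) if start <= t <= end)

# Hypothetical condition averages in microvolts, sampled every 100 ms
# from -100 ms to 700 ms relative to word onset.
times_ms = list(range(-100, 800, 100))
congruent = [2.0, 2.0, 3.0, 5.0, 6.0, 6.0, 5.0, 4.0, 3.0]
incongruent = [1.0, 1.0, 2.0, 3.0, 2.0, 1.0, 2.0, 3.0, 2.0]

congruent_bc = baseline_correct(congruent, times_ms)
incongruent_bc = baseline_correct(incongruent, times_ms)
n400_effect = difference_wave(incongruent_bc, congruent_bc)

print(n400_effect)
print(mean_amplitude(n400_effect, times_ms))
```

In this toy example the baseline-corrected incongruent waveform never drops below zero, yet the difference wave is negative throughout the window, which is the sense in which the N400 is a relative rather than an absolute negativity.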
Although some ERP responses are named for their presumed function, the N400 is not, and its functional characterization (like that of all cognitive measures) is in a continual state of fine-tuning. Its identity is some function of its morphology, timing, and behavior under certain experimental manipulations. Some electrophysiologists have argued for a precise neuroanatomical characterization, but that is not so straightforward in practice, especially given that the same functional operations may be carried out in different neuroanatomical substrates. Accordingly, we do not view “the N400” as an undifferentiable, localizable (or lesionable) neural entity that indexes one particular mental operation. Instead, we use the term N400 as a heuristic label for stimulus-related brain activity in the 200–600 ms post-stimulus-onset window with a characteristic morphology and, critically, a pattern of sensitivity to experimental variables – and hence a common functionality; next, we discuss what these variables are and how they impact human brain activity as well as our comprehension of sensory input.
Kutas’ mentor Donchin (1984) emphasized the importance of carefully characterizing ERP responses in terms of their functional sensitivity as a prerequisite to using them as markers of specific aspects of information processing. Accordingly, early years of work (reviewed in Hillyard & Kutas 1983; Kutas & Van Petten 1988) focused on determining what range of manipulations the N400 was and was not sensitive to (and how), and its relation to behavioral measures and other known ERP responses. Using the anomalous sentence paradigm as a starting point, the field asked whether N400 effects would obtain for any unexpected manipulation using words. The answer was a resounding no: there was no observable N400 effect to the final word of “I shaved off my mustache and BEARD” (a congruent but physically unexpected ending) compared to the physically expected congruent ending (beard). Likewise, the N400 also did not obtain in response to just any language-related violation: there were no N400 effects to simple grammatical (morphosyntactic) violations as in “All turtles have four leg” (versus legs). Other studies confirmed that the ERPs to all sentence final words – even congruent ones – are characterized by some degree of N400 activity, and further demonstrated that N400 amplitude is highly correlated (r = .9) with an offline measure of the eliciting word’s expectancy – i.e., cloze probability, the percentage of individuals who would continue a sentence fragment with that word. This sensitivity to cloze probability obtains irrespective of contextual constraint (negating a core prediction of any view that the N400 indexes inhibition, as in Debruille 2007): N400s to the sensible and equally low cloze probability completions of strongly constraining sentences (e.g., The bill was due at the end of the hour.), and of weakly constraining sentences (He was soothed by the gentle wind.) were statistically indistinguishable, and much larger than those to high cloze probability endings (The bill was due at the end of the month.).
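Because cloze probability is simply the proportion of respondents who supply a given continuation, it can be sketched in a few lines. The function name and the norming responses below are invented for illustration and are not data from any of the studies cited.

```python
from collections import Counter

def cloze_probabilities(completions):
    """Map each completion to the proportion of respondents who gave it.

    completions: one string per respondent, the word offered to continue
    a sentence fragment in an offline norming task.
    """
    normalized = [word.strip().lower() for word in completions]
    if not normalized:
        raise ValueError("need at least one completion")
    counts = Counter(normalized)
    total = len(normalized)
    return {word: n / total for word, n in counts.items()}

# Hypothetical responses from 20 people to the fragment
# "The bill was due at the end of the ..."
responses = ["month"] * 17 + ["week"] * 2 + ["hour"]
cloze = cloze_probabilities(responses)

for word, p in sorted(cloze.items(), key=lambda item: -item[1]):
    print(f"{word}: {p:.2f}")
```

Under these made-up norms, "month" would count as a high cloze ending and "hour" as a low cloze ending of a strongly constraining frame.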
Indeed, many laboratories demonstrated that semantic anomalies were neither necessary nor sufficient for N400 elicitation and that N400s did not always pattern with RTs. Fischler and colleagues (1983), for example, found that N400 amplitudes to the final words in affirmative (e.g., A robin is a bird/vehicle) and negative (A robin is not a bird/vehicle) sentences were determined exclusively by the relationship between the first and the second noun, and not by sentence meaning or truth value (which did, however, affect verification times). Numerous other studies further demonstrated that N400 elicitation did not require a sentential frame; for example, N400 effects obtained when the fifth item of a list mismatched rather than matched the prior four in semantic category membership. N400 effects also were observed in lexical priming paradigms, where a target word was or was not somehow related (e.g., identically, associatively, semantically, categorically, and perhaps phonologically) to an immediately preceding (prime) word; in all cases, related items showed reduced N400 amplitudes relative to unrelated items across a number of different tasks (reviewed in Kutas & Van Petten 1988).
Moreover, N400 effects were found to generalize across input modalities, including spoken words and American Sign Language signs and even language-like nonwords, i.e., pseudowords (reviewed in Kutas & Van Petten 1994). Importantly for functional characterization, N400 effects (albeit with varying amplitude distributions – topographies -- across the scalp) also were routinely seen to line drawings, pictures, and faces when primed (versus not) by single items or sentence contexts. However, N400-like activity was not observed in response to unexpected events in other structured domains such as music, be they the frequency of a note violating a musical scale sequence or a familiar melody; instead, such deviations elicited P3b-like potentials (Besson & Macar 1987). Clearly, the N400 is not a simple signature of the violation of any arbitrary or over-learned pattern. Overall, the early data suggested that the N400 indexed something fundamental about the processing of meaning and hinted that the meaningful/nonmeaningful dimension may be more important than the linguistic/nonlinguistic dimension.
We end this section by noting that although ERP parameters are sensitive to psychological variables they are neither generally nor readily reducible to psychological constructs. Ultimately it is the brain’s “view” of cognitive processing that we seek to characterize. ERPs provide a particularly apt inroad to this, by being a direct measure of neocortical activity that tracks brain states continuously and instantaneously. The N400’s relationship to other measures (e.g., RTs) whose functional sensitivities had been better mapped out was important as a starting point, but, ultimately, where ERPs are more direct, selective, and sensitive, the direction of influence must shift and, correspondingly, the field must be willing to rethink the pool of available cognitive constructs it has developed, largely from end-state measures. As we will show, in some cases the questions we ask about cognitive processing via the N400 have relatively clear answers; in other cases, however, our failures to converge upon an answer after intensive investigation raise the distinct possibility that the question itself may be ill-posed – either insufficiently clear or based on faulty assumptions.
As the N400 came to be characterized, it quickly became clear that it was the amplitude of the response that was most susceptible to manipulation (becoming smaller when factors rendered information more expected and thus easier to process) and most likely to vary with many of the same factors that influence RT measures (see overview in Kutas & Federmeier 2009). At a physiological level, amplitude reductions might reflect smaller post-synaptic potentials in the same neurons, activation of fewer neurons in a population, and/or less temporal synchrony among the generating neurons. N400 latency, by contrast, was generally quite stable - a fact whose theoretical significance we are just beginning to appreciate (see theory section). Characterizing N400 topography across the scalp has proven more difficult because a stable distribution was seen to visual words across manipulations, but temporally and functionally similar responses to other stimulus types had overlapping, but dissociable, scalp topographies. Whether this means there is more than one “N400” is difficult to answer based on surface potentials alone because of possible temporal overlap with other responses. At a deeper level, the answer depends on what is meant by different, and, as such, is theoretically laden. Early on, when the field was dominated by the information processing framework, assuming seriality and modularity of processes, topographic differences were often regarded as suggestive of different processors. However, as the field moved toward more distributed and interactive views, distributional differences were likely to be treated in a graded rather than categorical fashion.
It is also worth noting that in the early days there were heated arguments over whether the N400 was, for example, simply a longer-latency member of the N200 family of responses (typically preceding P3b’s to unexpected events) (e.g., Deacon et al 1991) or resolution of yet another component. Linking a newer response with a better-characterized one is useful in allowing new predictions and generalizations. Critically, however, such classifications often matter little for how the measure can be used. What is essential whether linking a newly-discovered response with an older one or splitting a well-studied response into subcomponents is that the measure be reliably identifiable in data and its sensitivity to stimulus and task properties mapped out; only then can it be used to meaningfully answer questions about cognitive and neural function. By 1988, the N400 had reached the status of a fairly well-characterized neural marker (“component”) and began to be usefully applied to a variety of questions, including those that were difficult or impossible to test with other metrics, and to challenge core assumptions about cognitive processing, especially in the domain of language (Kutas & Van Petten 1988; Kutas & Van Petten 1994).
Despite evidence to the contrary, the N400 has long been thought of as a “language” measure. As we hope to make clear, it is much more than that – however, it is true that the N400 is a particularly powerful tool for studying language. The N400 opened the door to investigation of the neural bases of language comprehension in the normal population, not just individuals with aphasia. By neural basis we mean characterization of the neural representations and functions supporting language processing, not localization of those functions. Interest in neural basis aside, the observed systematicity between sensory, conceptual, and linguistic factors and properties of the N400 (together with those of ERPs more generally) has made it amenable for addressing core psycholinguistic questions that had proven intractable to most other dependent measures.
A case in point is time. Its critical importance for language processing was never questioned; however, the timing of events – at a granularity relevant for psycholinguistics, let alone neurobiology – was very difficult to measure except via unnatural, disruptive probes. RTs are, by their nature, end state measures, unable to track moment by moment processing. Indeed, even online measures, such as self-paced reading and eye tracking, often showed reliable context effects only after several hundreds of milliseconds had passed by, in the “spill-over region”, one or more words beyond the word of interest. Psycholinguists thus devised elaborate paradigms, based on complex sets of assumptions, in an attempt to get at temporal aspects of language processing with RT measures; these included presenting stimuli for short durations or masking them to curtail processing, using relatively short stimulus onset asynchronies (SOAs) to disclose “early” influences of one word on another (again based on the idea that processing of a stimulus is somehow arrested when a new stimulus is encountered, although see Van Petten 1993), or using response deadlines. Indeed, these procedures have become so much the norm that ERP studies are sometimes criticized for drawing conclusions about timing without having resorted to any of them, even though time is an intrinsic property of the ERP. With ERPs, language manipulations can be tracked through time with millisecond resolution and without elaborate task conditions or any added task at all.
One of the first major contributions of the N400 to psycholinguistic research was therefore to show that effects of semantic manipulations could be seen almost immediately: the N400 congruity effect began ~200 ms (and peaked before 400 ms) into the processing of a critical word – written, spoken or signed. Furthermore, because ERPs can be examined in response to every stimulus, not just selected targets, they provide an instantaneous and continuous look at language processing. The ERP to every word read one at a time in the center of the screen (rapid serial visual presentation, RSVP) contains N400 activity, which is affected by context, thereby revealing the inherently incremental nature of language processes. Comparing normal English sentences with those that were syntactically structured but semantically anomalous revealed a linear decline in N400 amplitudes of open class words across the course of a congruent sentence (i.e., word position effect), which thus seemed to reflect the incremental build-up of semantic (and not syntactic) constraints (reviewed in Van Petten 1993).
In addition to providing exquisite timing information, ERP measures provide a means of assessing the qualitative similarity of two (or more) effects -- e.g., at different processing levels. N400 studies offered critical evidence for both temporal and qualitative similarity between the effects of a word prime and those of a sentence context on word processing. Studies showed that the morphology, timing, and topography of the visual N400 semantic priming effect for target words following semantically related versus unrelated primes were indistinguishable from those for the visual N400 effect to sentence final words of congruent versus anomalous sentences. This was an especially important finding because, on most accounts, word level priming was thought to be mediated by automatic spreading activation or at least by some process internal to the mental lexicon, impervious to any external context. Sentence context and other top-down effects, by contrast, were assumed to exert their influence via qualitatively different controlled processes, acting upon the representation of already “recognized” words – that is, post-lexically. Direct, within-subject comparisons of lexical and sentential N400 context effects, however, strongly argued against the possibility that they arose from different stages of language processing (Kutas 1993).
Indeed, N400 data provided clear evidence for interactions of sentence context effects with word frequency, word level associations, and word repetition and revealed that, when both word- and sentence-level information sources were available, higher-level context effects tended to override lower-level ones, contrary to the then-prevailing assumptions of bottom-up priority for and insularity of word level processing (reviewed in Kutas 1993; Van Petten 1993). For example, the semantic constraints provided by a sentential context can supersede lexical frequency effects on the N400. Furthermore, when compared directly within the same materials, independent and qualitatively similar effects were obtained for lexical associative priming and sentence-level congruity: N400 amplitudes were reduced to lexically-associated second words in anomalous sentences as well as to unassociated words in congruent sentences, demonstrating the build-up of message-level meaning information over and above word-level associations. When both information sources were present, their influences were additive (with later work showing that even stronger message-level constraints override lexical association: Coulson et al 2005). These findings helped to establish that semantic congruity, repetition, and word frequency converge to influence a common stage of word processing: the modularity of lexical processing was irreparably penetrated by incisive N400 results.
Over the course of determining the functional specificity of the N400, it became clear that whereas some types of language manipulations altered the amplitude of the N400, others, including syntactic violations, were associated with different types of ERP effects, such as a later positivity called the P600 or a temporally coincident negativity with a (sometimes left) frontal focus called the left anterior negativity (LAN). A few studies cleverly took advantage of the fact that grammatical violations can have semantic consequences to study syntactic aspects of language with the N400 (reviewed in Kutas & Kluender 1994; Kutas & Van Petten 1994). More commonly, however, this functional dissociation was used to determine how the brain classifies aspects of language about which (psycho)linguists were less certain – e.g., the agreement in gender between a pronoun (her) and its antecedent (the boy), which on some accounts could be syntactic (constraints imposed by one’s grammar) or, by other accounts, semantic (part of each word’s meaning and how they are used in discourse). The ERP results were clear: agreement violations did not modulate N400 activity, but elicited a P600 (and sometimes LAN) instead, suggesting the brain treated them as syntactic rather than lexico-semantic in nature (Osterhout & Mobley 1995). Similarly, Japanese researchers used the presence of large N400s to argue for a semantic over a morpho-syntactic account of the links between nouns and their classifiers (quantifiers that agree with the type of entity being counted, Sakai et al 2006).
The fact that N400s could be observed for both visual and auditory words afforded cross-modality comparisons that were relatively less tractable for RT studies. The functional similarity of the N400 generating process in the two modalities (e.g., sensitivity to semantic relations, sentential congruity, and related anomaly effects – N400 reductions to anomalous words semantically related to the highest cloze endings for a sentence frame) was theoretically important for psycholinguistics, making the occasional differences and interactions all the more notable. Auditory N400s tended to begin earlier (although not when speech was presented at a fixed rate, as in the visual modality, rather than naturally), last longer, and have a slightly more frontal and less right-biased topography (reviewed in Kutas & Van Petten 1994). Finding different patterns of N400 effects (across SOA) when the prime was an auditory word and the target a visual word or vice versa, Holcomb and Anderson (1993) argued for an amodal semantic system tapped by modality specific processes. Importantly, however, the modality general aspects of the N400 effects made it easier to investigate language representations and processes both independent of and as a function of input type, and helped validate the non-ecological RSVP procedure used for reading studies (to avoid contamination of visual ERPs with eye movements).
The N400’s broad sensitivity to meaningful stimuli and semantic manipulations meant that it could be used to ask questions about how meaning-related information is stored in the brain, in what is often called semantic memory (reviewed in Kutas & Van Petten 1994; Kutas & Federmeier 2000). These included studies of typicality and level of representation, of concreteness, and also of word type differences (e.g., nouns and verbs: Gomes et al 1997). At the most general level, two important findings cut across these studies. One was a dissociation between RT and N400 measures, which only sometimes behaved similarly. Such dissociations are common to cognitive ERP measures, given that individual components reflect only a subset of the processes that contribute to RTs – and are, in fact, a useful dependent measure specifically for that reason. The other, perhaps more important, finding was that N400 data often did not fully support any of the available theories (even if ERP authors sometimes were compelled to choose a position), suggesting instead that aspects of each were correct. Consider, for example, the N400 work aimed at unearthing why people find concrete words easier to process than abstract words. On a “dual coding” account, concrete materials accrue an advantage by virtue of being represented in two semantic systems – one verbal, in the left hemisphere (LH), which also represents abstract words, and another, image-based, in the right hemisphere (RH). By contrast, “context availability theory” grounds concreteness effects in quantitative differences in semantic richness within a common, amodal semantic system. Across a number of studies, N400 results have been systematically mixed: responses to concrete (versus abstract) words manifest different scalp topography, consistent with the dual coding view. 
However, such differences are reduced for words completing congruent sentences (Holcomb et al 1999), an interaction that is consistent with the context-availability view, albeit not in the precisely predicted form. The N400 data thus call for a theory that combines elements of both accounts.
More specifically, one early line of N400 work tested long-standing questions about whether semantic memory is best thought of as a single, amodal system (similarly accessed by stimuli with different surface forms, such as pictures and words) or as comprising a number of distinct (sub)systems (reviewed in Kutas & Federmeier 2000). The answer provided by N400 data – namely, yes and yes -- raised serious questions about the validity of this debate. For, although there were broad similarities (in waveshape, timecourse, and functional sensitivity of the N400 effect), there were also important differences (especially in terms of scalp topography) in the response to different types of meaningful items. Although much of the debate focused on comparisons of pictures and words, a wide range of stimulus types has been investigated, including faces, environmental sounds, and even odors. Pictures elicited a similar but more frontally-distributed N400, similar to that for concrete words (Ganis, Kutas & Sereno 1996). The N400 effects for familiar faces completed by mismatching versus matching internal features had an occipital maximum (Olivares et al 1999). Within-subject comparisons of words and meaningful environmental sounds demonstrated N400 effects for each, similar in all but hemispheric laterality, right dominant for words and left dominant for environmental sounds (Van Petten & Rheinfelder 1995). The weight of these studies pointed to a functional entity that varies systematically with relatedness within and across a wide range of sensory input types, but characterized by topographic differences that implicate an assortment of at least partially non-overlapping neural areas in meaning construction. N400s thus are modality-dependent but not modality-specific (perhaps marking a unimodal to amodal interface; see section on theory) – an electrophysiological marker of processing in a distributed semantic memory system.
With the growing sense of the N400 as an index of semantic memory and the dynamic processing that unfolds within it, it became of interest to localize the source(s) of this activity, to answer the question of not only what brain areas might be involved but when and how they might contribute (reviewed in Van Petten & Luka 2006; Lau et al 2008). Some attempts have been made to model the scalp-recorded signal, but more exciting developments involved the use of intracranial recording techniques, typically in individuals with epilepsy prior to surgical intervention. Such studies have identified a widely distributed set of brain regions co-active with the scalp-recorded N400. These include a source in the anterior medial temporal lobe (functionally akin to the scalp N400 with respect to manipulations of semantic priming, semantic congruity, repetition, and recognition (verbal) memory), in middle and superior temporal areas, inferior temporal areas, and prefrontal areas. Essentially these same brain areas in both hemispheres (although perhaps stronger in the LH) also have been identified by other neuroimaging techniques such as the magnetic counterpart of the ERP, the magnetoencephalogram (MEG), and the event related optical signal (Tse et al 2007); both have implicated the superior/middle temporal gyrus, temporal-parietal junction, and medial temporal lobe, and, with less consistency, the dorsolateral frontal cortex. These brain areas, as we know today, align with the distributed network presumed integral to semantic memory storage and processes, as independently identified by hemodynamic imaging and neuropsychological studies.
Naturally, there are clear benefits to localizing the neural generators of any scalp ERP activity, although more so when there is clear and independent evidence for the generating area’s function. As already noted, ERP components generally, and the N400 in particular, have been used successfully to answer cognitive questions even as their neural source remained obscure. Still, it is useful to now know that various neuroimaging data all point to a multimodal semantic system, and that MEG activity coincident with the scalp N400 suggests that it does not reflect activity in a single, static source but rather a wave of activity starting (~250 ms) in the posterior half of the left superior temporal gyrus, spreading first forward and ventrally to the left temporal lobe by 365 ms, and thereafter, between 370–500 ms, to the right anterior temporal lobe and to both frontal lobes. With such data in hand, the question becomes not where is the N400 generator localized, or whether there are multiple N400s, but rather what are the functions of the dynamic neural system of which scalp N400s are reflections?
The fact that one of the first variables found to clearly modulate N400 amplitude was repetition meant that the N400 could also be used to study aspects of recognition memory (reviewed in Friedman & Johnson 2000). N400 patterns in recognition tasks were similar to those seen for repetitions, with correctly identified old words eliciting less negativity in the N400 time window than correctly rejected new words. Dissociations of N400 memory-related effects from those on later components (such as the Late Positive Complex (LPC)) then helped to provide key support for multiple process/systems views of memory. Smith and Guster (1993), for example, provided early evidence for a dissociation between N400 memory effects and those due to recollection by showing that the magnitude of the N400 repetition effect was similar for memory judgments that entailed recollection or only a feeling of familiarity (whereas LPC modulations were yoked to recollection). The N400 also played a role in emerging studies using back-sorting paradigms to examine encoding-related brain activity that predicted later memory performance (a procedure later adopted by fMRI researchers, e.g., Jordan et al 1995). This serves as yet another example of the power of electrophysiology to get at aspects of cognition largely impenetrable with behavioral measures.
An important early – and now long-standing – debate concerned the role of attention in the elicitation of N400 effects. The prevalent view at the time divided cognitive processes into those that required attention (so-called “controlled” processes) and those that proceeded without attention or awareness (so-called “automatic” processes). Of interest, in the context of this debate, was whether the N400, and the aspects of semantic processing it seemed to index, are controlled or automatic in nature. In the domain of language, the answer to this question would help classify the N400 as a “prelexical” or “postlexical” process, happening either before or after the “magic moment” of word recognition. At a theoretical level, a lot might hinge on the answer to this question, as the sensitivity of the N400 to sentence-level context information would pose problems for certain theories if the N400 turned out to be prelexical, calling into question claims about the priority, modularity, and insularity of word level analyses. On the other hand, as it happens, very few of the interpretations gleaned from studies using the N400 as a dependent measure would be substantively affected by the outcome of this debate. The automatic-controlled distinction aside, the data collected in an attempt to resolve this debate address the more general question of whether word meanings are invariably activated – irrespective of perceptibility, attention, awareness, task relevance, and other resource demands – and, in either case, what variables determine what meaning information is activated, to what extent, and for how long (reviewed in Deacon & Shelley-Tremblay 2000). A corollary of this debate in the N400 memory literature centered on whether N400 modulations were best thought of as reflecting implicit or explicit aspects of memory. 
Initial studies finding that N400 context effects were sensitive to task demands (Chwilla et al 1995), selective attention and pattern masking led to the view that the N400 was a controlled process and, therefore, on many accounts, post-lexical; this was the basis for the contextual integration view of the N400. Subsequent N400 data, however, called for a more nuanced answer to this dichotomous choice.
Large N400 sentence context effects are elicited when a participant’s only task is to read or listen, confirming the intuition that semantic processing is what humans naturally do with language (i.e., the default). Moreover, similar magnitude effects were observed even when some secondary task – semantic, phonological, or graphemic in nature (Connolly et al 1990) – was imposed. When single words instead of sentences served as the prime, task demands had more of an effect. Although typically larger when instructions explicitly called for semantic analyses, reliable N400 effects (but not necessarily concomitant RT priming) were nonetheless observed in situations where semantic processing was neither necessary nor even beneficial. Moreover, N400 amplitude modulations were clearly seen with experimental manipulations aimed at minimizing controlled processes (e.g., of stimulus onset asynchrony, proportion of related stimuli, level of processing: reviewed in Holcomb 1988). Importantly, then, N400 measures revealed ongoing semantic processing even when such analysis was orthogonal to task performance and not evidenced in overt behaviors (e.g., Kuper & Heil 2009). Still, in all these types of studies, participants directed their attention to the stimuli (if not the semantic level of analysis), and that seemed to be important for N400 elicitation. This possibility was examined in a number of studies crossing priming with selective attention (reviewed in Deacon & Shelley-Tremblay 2000).
In a selective attention paradigm, participants are asked to respond to target items infrequently embedded in a stream of non-target items in an attended channel as they ignore all the items in an unattended channel. Non-target stimuli also may vary on other dimensions -- e.g., contain meaning relations between consecutive items -- and the question is the extent to which semantic processing is impacted by where attention is directed. Initial studies showed that attentional selection either eliminates (for selection based on visual location) or severely attenuates (for selection based on color) visual and auditory N400 priming effects, perhaps more effectively in the visual modality. McCarthy and Nobre (1993) observed semantic and identity priming effects on the N400 only for words appearing in the attended spatial location, inconsistent with an automatic N400. Subsequent studies factorially combined semantic priming with attentional status to create four conditions: both prime and target attended, both unattended, or either the prime or the target attended while the other is not. N400 repetition and semantic priming effects were most consistently observed for targets whether or not they were attended (albeit much attenuated for unattended targets) as long as the primes were attended. Observations of semantic priming on target N400s by unattended primes are more variable, but have on occasion revealed some level of semantic processing, which importantly can be dissociated from behavioral priming and recognition effects.
A parallel literature has examined the N400 automaticity issue by combining a priming paradigm with visual masking of either the prime or the target. Visual pattern masking results when a visual pattern is flashed at the same spatial location before or after a brief display of the item of interest and has the consequence of reducing that item’s conscious perceptibility (i.e., reportability). Overall, masking of primes has been found to attenuate but not to eliminate N400 effects, although Holcomb and colleagues convincingly attribute the residual to occasional prime visibility under masking (Holcomb & Grainger, 2009). A similar account, however, cannot explain the presence of the small but reliable N400 effect to semantically primed versus unprimed masked targets that participants could neither categorize much above chance nor report (Stenberg et al. 2000).
Other evidence for the presence of N400 semantic priming effects under conditions of reduced awareness comes from the literature on the attentional blink (AB) phenomenon (reviewed in Vogel et al 1998). The attentional blink is a short refractory period occurring about 300–600 ms after the detection of a target item (T1) in a stream of rapidly presented stimuli, during which subsequent targets (T2) are missed. ERP data have bounded the locus of T1-T2 interference as after initial perceptual processing of T2 (as reflected in normal early sensory visual potentials) but prior to the encoding of T2 into working memory (as reflected in elimination of the P3b component). Critically, when semantically related or unrelated word pairs are embedded in the stimulus stream, N400 effects are observed whether T2 is a target (Vogel et al 1998) or a prime (Rolke et al 2001) in the critical AB window. These results from the AB paradigm reveal that words that are attended and perceived – but not identified and encoded into working memory – can nonetheless have undergone some semantic analysis and, moreover, can influence the semantic analysis of upcoming items. Findings of N400 semantic priming effects during sleep (reviewed in Ibanez et al 2008) and N400 memory effects in amnesia (Olichney et al 2000) further support this view.
The N400 thus could not be neatly mapped into the automatic or controlled category, having characteristics associated with each (being importantly modulated by selective attention, and thus not fully automatic, but not requiring the kind of awareness important for controlled processing). This failure in the face of a vast empirical base strongly suggests problems with the initial framing of the debate.
Once the N400 was characterized as a measure – and in parallel with its use to begin to answer basic questions in language and memory – its power for studying special populations was recognized. In particular, the N400’s functional specificity offered an inroad to questions about the nature of certain deficits, at least under the right conditions (i.e., using well-tested paradigms that replicate in controls). Moreover, the N400 offered a critical opportunity to measure processing in groups with more limited abilities to meet the demands of typical cognitive tasks. Although a thorough review of the findings from studies using the N400 with special populations is beyond the scope of this piece, it is important to note that the N400 has now been used in the study of many different conditions, including Alzheimer’s disease, aphasia, autism, cerebral palsy (as a means of measuring vocabulary size), closed head injury, dyslexia (and other developmental language disabilities), epilepsy, mood disorders, Parkinson’s disease, psychopathy, and schizophrenia (reviewed in Munte et al 2000; Kuperberg in press). More generally, the N400, often in conjunction with neuropsychological measures, has been used to measure individual differences in language and memory functions in the general population, across the lifespan.
As the field matured, N400 data helped not only to answer classic, often subdiscipline-specific questions, but also to raise new ones – about the validity of long-standing theoretical dichotomies, the reality of certain core cognitive and linguistic constructs, and, indeed, the separability of the subdisciplines themselves. Accordingly, many studies in the literature of this time are difficult to neatly categorize into broad domains, as above. Instead, N400 work highlighted the complex interactions between – and inherent neural inseparability of – perception, attention, memory, language, and meaning.
Expanding on work comparing effects in word pairs and sentences, researchers began to look at the comprehension of multi-sentence texts (reviewed in Van Berkum 2009). They found that N400 amplitudes were sensitive to discourse in the same manner and with the same timecourse as to word- and (isolated) sentence-level constraints, and that the pattern of prevalence of higher over lower levels of analysis extended to discourse (e.g., discourse constraints reversing N400 repetition effects: Camblin et al 2007). For example, simply adding an informative title to a locally coherent but globally opaque text passage sufficed to reduce the amplitude of the average N400 to all the content words in the passage. Further work showed that N400s to words in identical, comprehensible sentences were smaller when they were consistent with the discourse context than when they were not. Thus, discourse effects unfold very rapidly – in the case of a spoken word, even before its acoustic realization ends. Given the similarity of these N400 discourse effects to other linguistic and even non-linguistic N400 effects, it would seem unnecessary to resort to any language-specific model to account for them.
Indeed, in many cases, discourse effects would seem to draw heavily on comprehenders’ world knowledge. On some accounts, this type of knowledge is taken to be distinct from facts about words and their meanings and thus should be processed differently – e.g., lexico-semantic knowledge integrated prior to world knowledge and pragmatics. N400 data unequivocally show that this type of account is not viable (Hagoort et al 2004): In the context of a sentence such as Dutch trains are ____ and very crowded, there is no measurable difference between the N400 to “sour,” which clearly violates semantic constraints, and that to “white,” which does not, but which is at odds with the fact of “yellow” Dutch trains (with smallest N400s); similar sensitivity to real-world script knowledge can be seen even in word priming (Chwilla & Kolk 2005). Voice-based inferences about who a speaker is (e.g., probable age, gender, and/or social status) and thus what they are likely to know, believe, or say also modulate N400 activity. Furthermore, what knowledge is used and how is quite dynamic and flexible. As reviewed in Van Berkum (2009), the large N400 amplitude associated with a pragmatic anomaly out of context (e.g., peanuts falling in love) can be eliminated by a context that identifies the situation as fictional. Similarly, the insensitivity of the processes indexed by the N400 to negation (discussed above) is ameliorated when negation is pragmatically licensed (i.e., is being used to reasonable purpose), with equally reduced N400 amplitudes to sentence final words in “With proper equipment, scuba diving is safe/isn’t dangerous.” N400 data thus attest to the incredible power of the language system to rapidly access, integrate, and adapt to word, sentence, and discourse information, along with world knowledge and common ground.
In addition to compelling the retrieval of facts, language supports abstraction and flights of fancy, not infrequently expressed via figurative language. Studies of nonliteral language processing have capitalized on the temporal precision of ERPs to test hypotheses about when and how (from what information) nonliteral meaning is constructed, and whether the processing of literal and nonliteral language differ in quantitative and/or qualitative ways. Irony and sarcasm have received some inquiry, but the primary focus has been on joke and metaphor comprehension (reviewed by Coulson in press). Overall, the processes indexed by the N400 seem to unfold in a similar manner with the same timing for figurative as for literal language, contra the view that meaning processing for nonliteral language is inherently slower. For instance, whereas N400s are larger for metaphors (He knows that power is a strong intoxicant) than for non-metaphorical controls (He knows that whisky is a strong intoxicant), they are of intermediate amplitude for literal sentences that require explicit mappings between objects and the domains in which they commonly occur (He used cough syrup as an intoxicant). N400 effects thus suggest quantitative rather than qualitative differences between literal and figurative language, with metaphors often taxing mapping and conceptual integration processes more than literal sentences. At the same time, however, some aspects of joke comprehension seem easier for the RH than LH, as inferred from N400 attenuations to probe words following one-line jokes.
Although the N400 is broadly sensitive to factors related to semantic fit, some surprises have been encountered, especially the finding that thematic role (verb argument) violations, which a priori were thought to be semantic in nature and certainly have strong semantic implications, did not necessarily modulate N400 amplitudes (reviewed in Kuperberg 2007). For example, N400s were not different to “eat” in “For breakfast the boys would only eat…” than in the thematically incongruent “For breakfast the eggs would only eat…” Instead, this comparison yielded a P600-like positivity. Importantly, such findings, like those for negation, constraint, and related anomalies, serve to emphasize that the N400 is not simply an index of semantic plausibility. Instead, it seems clear that plausibility judgments are some function of a number of processes that differ in the time course of their availability – and are usually evaluated as a non-speeded, end-state response. In contrast, the N400 occupies a temporally delimited place within an incremental system (see discussion in Federmeier & Laszlo 2009). Thus, in some cases (e.g., negation in the absence of pragmatic licensing), information that ultimately impacts plausibility judgments is not active in time to facilitate N400 activity. In other cases, information associated with implausible stimuli has, at least temporarily, been activated – e.g., because incoming words share features with contextually-induced predictions or are related to or associated with other words or concepts in the sentence or discourse – creating what Kuperberg (2007) calls a “temporary semantic illusion”.
Notably, in this context, it is important to distinguish between a lack of an N400 effect and a lack of an N400 component, because failure to find an N400 difference across conditions (because both are facilitated relative to a wholly unexpected “baseline” condition or because neither is) cannot be used to conclude that the operations indexed by the N400 have been “suspended” or “blocked” (Bornkessel-Schlesewsky & Schlesewsky 2008; Kolk & Chwilla 2007). In these cases there are N400 responses, just not differential levels of activity in the semantic system across two inputs that ultimately yield different plausibility judgments. Similar considerations apply to the literature employing “double violation” paradigms (in which a given word violates constraints at multiple levels of analysis) to ask how different aspects of language (especially semantic and syntactic ones, but also prosodic) interact (Gunter et al. 2000; Hagoort 2003). These studies have revealed complex interactions between meaning and form based analyses, but, more generally, serve to highlight the utility of the multidimensional nature of the ERP signal, which allows different types of language-related effects to be examined in parallel, on the same word, without obfuscating their separate influences.
With accumulating data attesting to the prevalent role of sentence and discourse context information in shaping language comprehension, a key question for the field became when and how context affects the processing of an incoming word. Answers to these questions would help adjudicate between bottom-up processing models and interactive ones, in which top-down and bottom-up information are assumed to be processed in parallel and in a mutually constraining manner. ERP work in this time period provided some of the earliest and most powerful evidence showing that context shapes word processing from its earliest stages. Indeed, it became clear that, at least for young adult comprehenders, context information actually serves to pre-activate features of likely upcoming words, such that the processing of unexpected stimuli that share semantic (Kutas & Federmeier 2000) or even orthographic (Laszlo & Federmeier 2009) features with predicted items is facilitated. Strong evidence for predictive processing in language came from ERP studies that examined responses to words preceding a predicted target – for example, function words or adjectives that, while perfectly compatible with the accrued context information, matched or mismatched in gender (e.g., van Berkum et al 2005; Wicha et al 2003) or form (e.g., English “a” versus “an”: DeLong et al. 2005) with a predicted (but not yet presented) upcoming word (e.g., “On windy days, the boy liked to go outside and fly a/an …” [where kite is predicted]). Because, in the absence of prediction, these modifying words constitute equally good fits to the accrued contextual information, N400 reductions when the words matched as opposed to mismatched the predicted target showed clearly that information about likely upcoming words had shaped the system in advance.
Furthermore, at least in the auditory modality where information about a word accrues over time, N400 data (Van Petten et al 1999) made clear that the processing of predictable and unpredictable words diverges prior to a word’s recognition point – essentially as soon as the system detects contextually mismatching perceptual information. ERP data have also suggested, however, a shift toward more passive, bottom-up processing strategies in many (although not all) healthy, aging individuals (Wlotko et al in press).
There are thus multiple routes to successful comprehension, and studies combining ERP measures with visual half-field presentation methods, used to bias processing toward the contralateral hemisphere, have made clear that it would be an error to conceive of comprehension as unfolding along a single processing stream, even within a given individual of a particular age (for reviews, see Federmeier 2007; Federmeier et al 2008). Such data reveal both important contributions to meaning comprehension from RH processing as well as important hemispheric differences, including the extent to which processing in each hemisphere is sensitive to prediction-based influences. Federmeier (2007) has hypothesized that an important source for such asymmetries may be more efficacious top-down connections in the LH, supported by LH-dominant language production mechanisms. Regardless of the precise nature and source of the differences, however, ERP data make clear that any complete theory of language processing will have to acknowledge the separate contributions of these two processing streams and explain how they are integrated to serve comprehension goals.
Collectively, N400 sentence processing data point to a language comprehension system that makes use of all the information it can as soon as it can in order to deal with a rapid, noisy input stream. Such data helped to compel a shift away from models that treat word recognition as a relatively isolated, data-driven process and also led to a reconsideration of the “postlexical” view of the N400. With that, came a surge of interest in using the measure to examine aspects of word recognition, including the nature and influence of orthographic and phonological levels of structure, and the representation and processing of morphology. The results lined up well with interactive views, revealing highly intertwined and sometimes multiple effects of many lexical variables (reviewed in Barber & Kutas 2007). Furthermore, the fact that not only words but also pseudowords (e.g., GORP) and even illegal strings (e.g., NKL) were subject to the processing reflected in the N400 (Laszlo & Federmeier, 2009) and similarly sensitive to language-relevant variables (such as orthographic neighborhood size) suggests that the process of categorizing inputs as lexically represented or even orthographically regular occurs in parallel with attempts at meaning access.
ERP evidence thus suggests that the word recognition process is extended over time, with critical aspects only beginning to take place around 200 ms after word onset; N400 data have mapped out a very similar time course for face recognition (reviewed by Schweinberger & Burton 2003). Although this accords with more general views of the time course of processing in the brain, as well as with what is known about the nature and neural source of components preceding the N400, it has sparked some controversy. In particular, the time course of processing suggested by ERPs is striking given that eye movements during natural reading tend to be fast and that some models of eye movement control assume that word recognition drives (and hence precedes) saccades (Reichle et al 2003). Thus, N400 data place important constraints on our understanding of when and how words are recognized and linked to meaning, with implications not only for psycholinguistic theories but also for, e.g., models of eye movement control.
In parallel with studies of the factors affecting adult word recognition is a growing literature harnessing the power of ERPs to study language learning and development. The neural architecture necessary to support N400 processing seems to be available quite early, as N400 effects have been reported in children as young as 9 months viewing unexpected events in action sequences (Reid et al 2009). In the domain of language, picture/word congruity N400 effects have been seen by 12–14 months of age, and vary with (productive) vocabulary (Friedrich & Friederici in press). Semantic congruity effects in sentences have been reported by 19 months, and are predictive of language skill at 30 months (Friedrich & Friederici 2006).
ERPs have also been used to address a number of important issues in the area of bilingual language processing, including questions about critical periods, effects of language proficiency and dominance, relationships between a bilingual’s two languages, and effects of code-switching, among others (reviewed in Kutas et al 2009). N400 semantic priming and sentence congruity effects have been reported for more than a dozen languages in both monolinguals and bilinguals, with no evidence that N400s differ in their timing or topography as a function of specific language characteristics or writing systems. N400 parameters, however, are sensitive to an individual’s proficiency with a language, even if they know only one, making them an excellent tool for investigating competence in adult second language acquisition. McLaughlin et al. (2004), for instance, showed that N400 amplitude reductions could distinguish words from pseudowords with only 14 hours of classroom instruction in a second language, and semantically related words from unrelated ones with as little as 63 hours. These reliable N400 effects were accompanied by chance level overt word and relatedness judgments, highlighting the sensitivity of ERP measures to early, implicit aspects of learning. More typical bilinguals show N400 priming and semantic congruity effects in both their languages, with the timing (and to a lesser extent the amplitude) of these effects a function of language proficiency and age of acquisition – being later and smaller for less well-learned languages.
N400 studies expanded into much richer nonlinguistic contexts (reviewed in Sitnikova et al. 2008), looking at congruency effects within picture sequences conveying a story, photos of objects in a visual scene (roll of toilet paper versus soccer ball in a soccer game), and short videos of everyday events (razor versus rolling pin used as a razor in a clip of a person shaving). The N400 effects elicited in these paradigms resemble lexico-semantic N400s in morphology and timing, with some differences in scalp topography (more frontal than those seen for written abstract words) – although temporal overlap with an earlier, frontally-distributed negativity (N300) observed in such paradigms complicates topographic assessments. Willems et al. (2008) directly compared speech and picture N400 effects as individuals listened to sentences in which a critical word, a coincident depicted object, or both could be contextually congruent or incongruent. Relative to the wholly congruent condition, all mismatch conditions yielded larger fronto-central N400s indistinguishable in amplitude, latency or scalp topography. Such data argue against any view of sense-making in which linguistic information is treated differently (in time or nature) than non-linguistic (objects, scenes) information. Indeed, even for linguistic stimuli, emerging N400 data made clear that many different kinds of features, including shared physical shape (e.g., between coin and button: Kellenbach et al 2000) and emotional valence (Schirmer & Kotz 2003), contribute to semantic analysis. The view of the N400 as a marker of “language processing” is thus shifting toward a view of the N400 as reflecting meaning processing more broadly. Correspondingly, the N400 is increasingly appreciated as a tool for examining object and face recognition, as well as action and gesture processing.
Relative to typical actions, actions that are purposeless, inappropriate and/or impossible along various dimensions (e.g., cutting jewelry on a plate with a knife and fork, cutting bread with a saw, standing on one foot in the middle of a desert) elicit N400 effects, suggesting functional similarity in comprehending everyday scenes and linguistic expressions of such (e.g., Proverbio & Riva 2009). Critically, both the appropriateness of the object used in an action (e.g., inserting a screwdriver versus a key into a keyhole) and the appropriateness of features of the motor act itself (e.g., orientation of the object with respect to the keyhole) individually and jointly modulate N400 amplitudes (Bach et al 2009), although topographic and durational differences implicate partially non-overlapping neural systems. Other N400 data have shown that the relationship between hand shape and the shape of an object to be grasped is used by observers to make sense of the details and cooperativeness of interpersonal actions (Shibata et al 2009). This emerging line of N400 work thus shows that motor and object features make early, parallel contributions to how an action is understood. Moreover, preparation to execute a meaningful (but not meaningless) action influences the N400 to a word (related or not to the action’s goal) presented prior to action execution, indicating that semantic activation may be inherent in action preparation (van Elk et al. 2008). Clearly, actions can serve as a semantic context for words, and the N400 can serve as a means of assessing how and when conceptual knowledge, language, and action converge.
The relationship between motor activity and language has been further explored in the domain of gesture, with N400 data providing strong evidence that gestures are analyzed and used semantically. N400 effects have been reported for gestures that mismatch a prior spoken word with respect to an object property and those that mismatch an action in a preceding cartoon sequence or spoken sentence (Kelly et al 2004; Wu & Coulson 2005). Moreover, when either of a co-occurring gesture and spoken word was semantically anomalous (versus congruent) with an ongoing spoken sentence context, the resulting N400 effect had the same time course (Ozyurek et al 2007). N400 data thus have demonstrated that body movements can influence ongoing language comprehension almost immediately and in a manner functionally indistinguishable from linguistic inputs.
Mathematics is another domain, usually considered quite distinct from language, which has shown canonical N400 effects. Whether using an arithmetic verification task or a more implicit probe task (Galfano et al 2004), responses to incorrect (versus correct) solutions are characterized by centro-parietal negativity, which within-subjects comparisons indicate is virtually identical to that for semantically incongruous words in written sentences (Niedeggen et al 1999). Perhaps even more importantly, the arithmetic N400 effect shows a remarkable functional similarity to lexico-semantic ones: its amplitude is similarly sensitive to relations between items in long-term memory and is dissociable from concomitant RT measures. For example, Niedeggen and Roesler (1999) recorded ERPs and RTs for multiplication facts (5 × 8) that were either correct (40) or incorrect, and, when incorrect, varied in numerical distance from the correct product and were either related (32, 24, 16) or unrelated (34, 26, 18) to the operands. Both these factors influenced RTs, additively. N400s, however, were small to correct solutions, equally large to all unrelated solutions and to distant, related solutions, and intermediate in size to incorrect solutions that were both close and related. Thus, although the neural systems need not be identical, it certainly seems that similar functional principles are at work during the processing of semantic/linguistic and arithmetic knowledge.
For a time, memory research parted ways with the rest of the N400 literature, in large part because the N400-like response studied in the context of recognition memory came to be regarded as a separable component – the “FN400” (frontal N400) – based on its apparently different scalp topography and purported link with explicit, familiarity signals. In a pivotal study, Curran (2000) investigated FN400 effects using a plurality reversal manipulation shown to reduce recollection but to have little effect on familiarity. This manipulation had no effect on FN400 repetition effects, but did influence ERPs associated with recollection (LPCs), supporting an association between FN400 and familiarity. Many studies followed using similar techniques to associate FN400s with familiarity and LPCs with recollection, thereby supporting dual process memory models that posit a qualitative distinction between the brain areas and processing involved in feelings of familiarity and those involved in conscious recollection (reviewed in Rugg & Curran 2007).
However, more recently, some have questioned the distinction between the N400 and FN400. Topographical differences (which are difficult to interpret given the possibility of component overlap) and paradigmatic differences notwithstanding, no study has actually dissociated the N400 from the FN400. Indeed, some evidence indicates functional similarity. For instance, like the N400, the FN400 does not vary in latency as a result of multiple encounters with the same stimulus (Johnson et al 1998). Moreover, Paller and colleagues (2007) have argued that empirical evidence showing the FN400 is not sensitive to conscious recollection does not necessarily warrant the conclusion that it must be related to familiarity.
Thus, some recent research has endeavored to reconnect the two literatures by positing that in the context of memory the FN400 is actually a marker of facilitated conceptual processing due to repetition, or “conceptual priming” – and not directly related to feelings of familiarity. On this view, then, the FN400 is an N400, elicited by meaningful stimuli in recognition tasks and modulated in amplitude when prior exposure renders semantic processing easier. To dissociate conceptual priming from familiarity, these studies have measured both in highly similar circumstances in order to identify ERPs that vary with one versus the other. Voss and Paller (2006), for instance, found that priming conceptual information associated with celebrity faces led to FN400 effects that covaried with the magnitude of priming, but, critically, not with familiarity for the same faces. Another line of research used abstract geometric shapes, which vary in their meaningfulness to individual participants and thus in the extent to which they can support conceptual priming with repetition, but which can be made more familiar by repetition, independent of meaningfulness (Voss & Paller 2007). Only meaningful shapes produced FN400 effects, indicating that it is the potential for conceptual priming, not familiarity, which drives FN400 modulations. Such studies collectively provide strong evidence against purported functional distinctions between the FN400 and the N400, indicating instead that the processing responsible for the N400 is also active during recognition memory tasks.
With the rapid growth of the literature have come attempts to develop theories of the N400. Some of these are anchored at the neurobiological level, seeking to delineate the brain network(s) responsible for the N400 (Lau et al 2008) and linking the component to specific neural functions, such as binding (Federmeier & Laszlo 2009). Others are framed at a functional level, mapping the N400 onto particular cognitive operation(s), such as orthographic/phonological analysis (Deacon et al 2004), semantic memory access (Kutas & Federmeier 2000; van Berkum 2009), or semantic/conceptual unification (Hagoort et al 2009). Many of these functional views are based on the underlying assumption that comprehension involves a feedforward series of processes in which words are analyzed first as perceptual objects and then as linguistic objects (lexical processing), culminating in the match between a phonological or orthographic input and a representation in the “mental lexicon” – i.e., in word recognition. Critically, upon recognition, semantic (among other types of) information becomes available and can then be integrated with the current mental model of the unfolding sentence or discourse. A similar stream of increasingly complex perceptual analyses leading to recognition, which then affords semantic access, also has been posited for face and object processing. On such a view, the N400 might be functionally characterized as arising from one (or more) of these processing steps, and, indeed, which one(s) of these processes the N400 reflects is what distinguishes most of the currently competing theories.
On one end of the spectrum are views (Brown & Hagoort 1993; Hagoort et al 2009) that position the N400 relatively late (post item recognition) in this processing stream, associating the N400 with processes linking up (“integrating”) the semantic information accessed from the current word with meaning information encompassing multiple words (e.g., sentence or discourse message-level representations, presumably held in working memory). In particular, Hagoort et al (2009) identify the N400 with semantic “unification” processes, defined as “the integration of lexically retrieved information into a representation of multi-word utterances, as well as the integration of meaning extracted from non-linguistic modalities”, placing special emphasis on the constructive nature of the meaning integration (“a semantic representation is constructed that is not already available in memory”). Views of this type that associate the N400 with “post-lexical” aspects of semantic analysis can readily account for the multimodal and cross-modal nature of the N400, on the assumption that the various meaning-laden stimulus types ultimately converge on shared (or at least partially overlapping) conceptual level representations. They can also easily explain both the sensitivity of N400 amplitude to pragmatic and discourse-level manipulations, and the relative precedence, in many cases, of such high-level factors over lower-level ones. However, a challenge for such late-stage accounts is the presence of N400s to stimuli that are not lexically represented (pseudowords and even illegal strings), and the emergence of N400 effects to lexically represented stimuli prior to their recognition point (i.e., before a listener knows which word s/he is hearing).
Furthermore, for all of these stimulus types, N400 amplitude is sensitive to a whole host of factors whose sole or primary influence is presumed to be at earlier (pre-lexical or lexical) processing stages, such as orthographic neighborhood size and neighbor frequency, lexical (and subpart) frequency, orthographic and phonological similarity, and repetition. Finally, integration/unification processes have been linked to top-down control mechanisms, yet N400 effects have been observed under conditions in which such control seems unlikely, as for semantic priming effects under the attentional blink or repetition effects in amnesic patients.
On the other end of the processing continuum, the fact that, e.g., N400 repetition effects are seen even for pseudowords with little resemblance to known words – i.e., stimuli that are not represented in the mental lexicon and thus presumably cannot have associated semantics – has led others to postulate that the N400 reflects processing stages prior to word recognition and semantic access, such as orthographic and/or phonological analysis (Deacon et al 2004). The strengths and weaknesses of this pre-recognition view are a mirror image of those of the integration view: this framework provides a straightforward explanation for basic lexical level influences on the N400 but no obvious account of discourse effects and their precedence in shaping N400 patterns. Moreover, as this account is word-specific, N400 effects to non-word stimuli must be assumed to arise from functionally similar – but nevertheless distinct – neural activity.
The broad sensitivity of the N400 to both lower and higher level factors that impact meaning processing has spawned a number of accounts that position the N400 at the junction where these processes intersect – namely, at the level of semantic access itself (Kutas & Federmeier 2000; see also Lau et al 2009; van Berkum 2009). However, even this “middle ground” approach cannot explain the full range of N400 data if the assumptions of the traditional processing model are maintained. For example, on the assumption that a word must be recognized before its meaning can be accessed, N400 effects for non-lexically represented stimuli remain inexplicable. Similarly, on the assumption that processing is wholly or largely feed-forward, it becomes difficult to reconcile the immediate and often dominant influences of more global aspects of context on initial semantic access.
Given that the N400 does not readily map onto specific sub-processes posited in traditional frameworks, which have been built largely from behavioral and linguistic evidence, it may prove more fruitful to use what has been learned about the N400 to reshape the underlying conceptualizations of how comprehension unfolds, in ways that are more constrained by our understanding of neural processing. In the context of the typical stream of brain activity triggered by an incoming stimulus, the N400 can be characterized as a temporal interval in which unimodal sensory analysis gives way to multimodal associations in a manner that makes use of – and has consequences for – long term memory. Processing in the first 200 or so milliseconds after the onset of a potentially meaningful stimulus is dominated by brain activity related to perceptual analysis, which differs across modalities in its spatial and temporal characteristics as well as in its sensitivity to factors like attention. With the N400, then, these different input streams converge – temporally, spatially, and functionally. Given notable variability across stimuli in factors such as familiarity and perceptual complexity, it would seem that the time needed to settle into a final, stable state of perceptual analysis (i.e., “recognition”) must necessarily differ for different types of input (and, likely, also as a function of task and context). Indeed, behavioral and other electrophysiological indices (such as the P3b) change their latency relative to the stimulus of interest in ways that are systematically related to these factors (e.g., Kutas et al 1977). Yet, such variables routinely affect the amplitude – but usually not the timing – of the N400 (see review in Federmeier & Laszlo 2009).
This implies that initial access to long-term multimodal (“semantic”) memory, as indexed by the N400, occurs at different points along the apprehension-to-recognition continuum for different stimuli and under different conditions: some stimuli may be “recognized” before access but, for others, access may be initiated before recognition is complete. In other words, access to item-associated information in long-term memory (LTM) may be decoupled from recognition.
Because the transition from unimodal to multimodal processing (e.g., from wordform to the concept that word brings to mind) is neither dependent upon nor driven by a particular functional outcome of perceptual analysis, all types of stimuli, from the highly practiced to the completely novel, would be expected to elicit N400 activity to some degree, with the amount and nature of that activity a function of the stimulus-induced state of the perceptual system at the time that semantic access is initiated. For example, to the extent that neural representations are distributed and/or the activation of stimulus features is noisy (e.g., that the visual input C-A-B activates not only “CAB” but also, to some extent, “DAB”, “CUB”, and “CAR”), semantic information associated with all of these co-activated representations will come online together, with a strength proportional to the strength of the eliciting perceptual signal. Rather than reflecting the activation of “a word’s meaning”, then, the N400 region of the ERP is more accurately described as reflecting the activity in a multimodal long-term memory system that is induced by a given input stimulus during a delimited time window as meaning is dynamically constructed (see discussion in Laszlo & Federmeier submitted). Such activity is observed for nonwords as well as for words, rendering the system more robust to input noise and providing a mechanism for implicit conceptual-level learning of novel stimuli (Gratton et al 2009). Baseline activity levels, however, will be higher for some stimuli than for others – because they are similar to or associated with many other things stored in memory (e.g., have high orthographic neighborhood sizes, large cohorts, or many lexical associates) and/or because they are linked to more meaning features (e.g., are ambiguous, more polysemous, concrete as opposed to abstract, etc.).
Given evidence that the semantic feature information being accessed is widely distributed across the neural network, it also follows that different stimuli and types of stimuli (e.g., words and pictures) elicit functionally similar but spatially different activity patterns across this distributed network, resulting in the observed differences in the topography of scalp-recorded N400s.
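The C-A-B example can be made concrete with a deliberately simple sketch. This is a toy illustration only, not a model proposed in the work reviewed here: the function names, the five-item lexicon, and the use of letter-position overlap as a stand-in for "strength of the eliciting perceptual signal" are all assumptions introduced for illustration.

```python
def positional_overlap(stimulus: str, word: str) -> float:
    """Fraction of letter positions at which stimulus and word agree.

    Hypothetical stand-in for perceptual signal strength; strings of
    different length are treated as non-overlapping for simplicity.
    """
    if len(stimulus) != len(word):
        return 0.0
    matches = sum(a == b for a, b in zip(stimulus, word))
    return matches / len(stimulus)


def coactivation(stimulus: str, lexicon: list[str]) -> dict[str, float]:
    """Graded activation of every stored item that overlaps the input."""
    activations = {}
    for word in lexicon:
        strength = positional_overlap(stimulus, word)
        if strength > 0.0:
            activations[word] = strength
    return activations


# Hypothetical miniature lexicon, chosen only to mirror the example in the text.
TOY_LEXICON = ["CAB", "DAB", "CUB", "CAR", "DOG"]

# A known word fully activates itself and partially activates its neighbors.
word_pattern = coactivation("CAB", TOY_LEXICON)

# A pseudoword has no exact match, yet still partially activates stored items.
pseudoword_pattern = coactivation("CAV", TOY_LEXICON)
```

In this sketch the pseudoword "CAV" produces graded activity across several stored items without fully activating any of them, which is the sense in which novel inputs would still be expected to elicit some N400 activity.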
Although the N400 reflects stimulus-induced semantic activity in LTM, it does not necessarily follow that the activation states of the semantic memory system as a whole are strictly a function of the current input (eliciting stimulus) or, indeed, even of feed-forward stimulation in general. There is presumably always activity in the semantic system, and that activity is in constant flux in response to both external and internal events and states. For example, information that is encountered more often may tend to have higher baseline states of activity, and information that has been accessed recently – due to stimulus repetition or featural overlap – also will tend to be more active. Furthermore, activation states can be modulated by internally generated events, such as recalling a stimulus or predicting an upcoming one. Finally, a wide variety of state and trait-based factors – e.g., mood (Federmeier et al 2001), schizotypy (Kiang & Kutas 2005), etc. – as well as task demands and goals may change activity levels in semantic memory, at the global or more local levels, if not both. In all of these cases, to the extent that some information represented in LTM is already partially or wholly active by the time semantic access for a given input stimulus is initiated, that information will not need to become active in response to the input. Thus, preactivation of semantic information, by any means, will tend to reduce baseline N400s to a stimulus that would normally activate that information. As a consequence, in an experimental context the N400 response to a given input can be used as a tool to assess semantic memory states, with the amount of N400 reduction (relative to a control condition) revealing how much of the information normally elicited by that stimulus is already active.
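The logic of using N400 reduction as a readout of semantic memory state can be sketched as simple bookkeeping. The following is a toy illustration under stated assumptions, not a quantitative model from the literature: semantic memory is reduced to a dictionary of hypothetical feature activations between 0 and 1, and the "N400" is taken to be the summed activity that the input still has to add on top of what is already active.

```python
def stimulus_induced_activity(target: dict[str, float],
                              state: dict[str, float]) -> float:
    """Summed activity the input must newly induce, given the current state.

    target: activation each feature would reach in response to the stimulus.
    state:  activation already present before the stimulus (missing = 0).
    Both structures and the summation rule are illustrative assumptions.
    """
    return sum(max(0.0, level - state.get(feature, 0.0))
               for feature, level in target.items())


# Hypothetical features normally activated by some stimulus.
target_features = {"vehicle": 1.0, "yellow": 0.5, "city": 0.5}

# Control condition: none of the relevant information is preactivated.
control_state = {}

# Supportive context: some of the information is already partly active.
primed_state = {"vehicle": 1.0, "city": 0.25}

control_n400 = stimulus_induced_activity(target_features, control_state)  # 2.0
primed_n400 = stimulus_induced_activity(target_features, primed_state)    # 0.75

# The reduction relative to control estimates how much was already active.
n400_reduction = control_n400 - primed_n400                               # 1.25
```

Note that the sketch is indifferent to how the preactivation arose (repetition, featural overlap, prediction, or task set), which mirrors the claim that preactivation by any means should reduce the response.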
The N400 window thus provides a temporally-delimited “electrical snapshot” of the intersection of a feedforward flow of stimulus-driven activity with a state of the distributed, dynamically active neural landscape that is semantic memory. As such, N400 activity can be modulated by factors that affect either the input stream or the configuration of activity in semantic memory. Manipulations of attention, for instance, may affect either or both of these levels. Attentional manipulations may serve to preactivate information in semantic memory by rendering some information more important and/or more predictable under one task condition than another, and/or by modulating the strategies that participants choose to or can use – for example, what kind of controlled processes are brought to bear in order to remember, integrate, or disambiguate inputs – with consequences for the state of semantic memory then encountered by subsequent stimuli. Such manipulations might thus affect the size of N400 effects observed. Van Berkum (2009) also emphasizes that semantic retrieval can be “intensified” by attention. On the view we are building here, one way that this could happen is by effects of selective attention on the sensory input. Much ERP work details how selective attention to space, objects, and various perceptual features of objects can modulate the amount of feedforward activity elicited by an incoming stimulus (Luck et al 2000). To the extent that the N400 is part of this feedforward stream, selective attention would be expected to correspondingly modulate even baseline N400 amplitudes. However, just as selective attention generally does not eradicate sensory ERP components, it would seem unlikely to eliminate all signs of N400 activity, and, since the nature of attentional effects differs across modalities, N400 modulations by attention would likely differ as well.
We suggest that attention can thus serve to alter the balance between the contributions of feedforward and top-down activity to the processing of meaning. That balance also seems to be importantly different between the two cerebral hemispheres (Federmeier 2007), with concomitant impact on the N400 seen when processing is biased toward the LH or the RH (which have separable feedforward perceptual streams, each capable of triggering and modulating an N400).
In sum, the N400 arises from a time period in which stimulus-driven activity enters into temporal synchrony with a broad, multimodal neural network, whose current states have been shaped by recent and long-term experience of a wide range of types (e.g., based on world experience, long-standing and recent linguistic and nonlinguistic inputs, attentional states, affect/mood, etc.). Federmeier and Laszlo (2009) have hypothesized that this temporal synchrony effects a binding, creating a multimodal conceptual representation. Notably, on this view, conceptual representations are not “looked up” in memory but are dynamically created and highly context dependent: because semantic memory states are continuously changing, the meaning of a given stimulus, defined as the configuration of neural activity that is bound together in response to that stimulus, will be somewhat different across people, time, contexts, and processing circumstances. The binding that occurs during the N400 is implicit and transient – although the activity elicited by a given input will have an influence for a short time (e.g., in the form of repetition or conceptual priming effects), even when explicit memory systems are compromised (as in amnesics, for example). Furthermore, given that the N400, and effects on it, can be obtained under conditions of reduced awareness, it seems unlikely that N400 activity is directly responsible for the conscious experience of meaning (although the presence of N400 activity may be a useful marker of whether neural systems have the properties and/or integrity to support such awareness: Schoenle & Witzke 2004). Initial conceptual representations may thus be lost, or may enter into consciousness, be stored in working memory, and/or (assuming an intact hippocampus) come to be stored in long-term memory.
More generally, this highlights the fact that the meaning of a stimulus is not computed at a single point in time, but rather is something that emerges through time, with the activity measured in the N400 representing an important aspect of that emergent process, but not, certainly, the final state. Indeed, at the time of the N400, meaning states may still be incoherent, either because a given stimulus elicits more than one disparate meaning (i.e., is ambiguous or, as in the case of unfamiliar nonwords, broadly elicits activity associated with similar inputs) or because context information has induced one state and the incoming stimulus a different one (e.g., in the case of semantic anomalies, but also more generally when unexpected words are encountered or contextually-induced predictions are disconfirmed). Thus, initial conceptual representations, as reflected in the N400, will often need to be refined with time, either through continued interactions within semantic memory or via the application of later-occurring processes that serve to select meaning features, revise initial interpretations, or otherwise update meaning representations (for example, adding information that might not have become available by the time the N400 was triggered; see discussion in Federmeier & Laszlo 2009).
Although an emerging theory of the neural and functional nature of the N400 may help to organize the growing literature and spark novel predictions, the outcome of current theoretical debates would have remarkably little impact on the contributions of the great majority of the work highlighted throughout this piece using the N400 as a dependent measure. Impressive gains have been made across a whole host of cognitive domains, and, perhaps most critically, the N400 has been instrumental in breaking down barriers between those domains. The N400 literature, taken as a whole, provides a compelling picture of how perception, attention, memory, and language jointly participate in the neural events responsible for the N400 – and thus together contribute to the amazing ability of the human brain to infuse its world with meaning.
M. Kutas, K. Federmeier, and much of the work cited herein were supported by grants HD022614 and AG83013 to MK and AG26308 to KF. We gratefully thank G. Dell, V. Ferreira, S. Laszlo, C. Lee, T. Münte, T. Urbach, J. Voss, and E. Wlotko for their comments on a previous version of this article. There would be little to review had it not been for the hundreds of studies and reports by scientists worldwide. We salute you. We wish we had the space to cite each and every one of the articles that shaped our writings; instead, due to space constraints, we primarily cite reviews for all but the recent years. We thank you all.
Marta Kutas, Departments of Cognitive Science and Neurosciences, Center for Research in Language, Kavli Institute for Brain and Mind, University of California San Diego.
Kara D. Federmeier, Department of Psychology, Program in Neuroscience, and The Beckman Institute of Advanced Science and Technology, University of Illinois, Urbana.