Speech unfolds swiftly, yet listeners keep pace by rapidly assigning meaning to what they hear. Sometimes, though, initial interpretations turn out wrong. How do listeners revise misinterpretations of language input moment-by-moment, to avoid comprehension errors? Cognitive control may play a role by detecting when processing has gone awry, and then initiating behavioral adjustments accordingly. However, no research has investigated a cause-and-effect interplay between cognitive control engagement and overriding erroneous interpretations in real-time. Using a novel cross-task paradigm, we show that Stroop-conflict detection, which mobilizes cognitive control procedures, subsequently facilitates listeners’ incremental processing of temporarily ambiguous spoken instructions that induce brief misinterpretation. When instructions followed Stroop-incongruent versus Stroop-congruent items, listeners’ eye-movements to objects in a scene reflected more transient consideration of the false interpretation and earlier recovery of the correct one. Comprehension errors also decreased. Cognitive control engagement therefore accelerates sentence re-interpretation processes, even as linguistic input is still unfolding.
That people occasionally misunderstand what a speaker says isn’t newsworthy. Yet comprehension errors are usually fleeting—how do listeners revise instantly before communication hits an impasse, and fails? People, after all, interpret language input moment-by-moment despite the hurtling rate of speech (Allopenna, Magnuson, & Tanenhaus, 1998; Altmann & Kamide, 1999). However efficient, sometimes early interpretations must be revised when disconfirming evidence arrives promptly after the prior word.
(1) Put the dumpling on the plate into the wok.
The phrase “on the plate” could specify either the dumpling’s goal, or modifying information about one dumpling that distinguishes it from another (e.g., one on a platter). Listeners commit rapidly to the goal analysis, particularly when no other dumpling is mentioned, because “put” requires a goal (Spivey, Tanenhaus, Eberhard, & Sedivy, 2002). Yet, “into the wok” eventually signals the true goal of “put”, compelling listeners to quickly revise their default characterization of “on the plate” as the dumpling’s endpoint. What cognitive mechanics underlie listeners’ ability to revise misinterpretations as input unfolds, to avoid comprehension fiascos?
One proposal claims that domain-general cognitive control procedures, which detect and resolve conflict during information-processing through flexible behavioral adjustments (Botvinick, Braver, Barch, Carter, & Cohen, 2001), also enable revision following linguistic misanalysis (Novick, Trueswell, & Thompson-Schill, 2005). Correlational patterns support this view: frontal patients with cognitive control deficits fail to correct misinterpretations, for instance by putting the dumpling in (1) on the plate, not in the wok, after their eyes “lock on” to the false goal in a visual scene (Novick, Kan, Trueswell, & Thompson-Schill, 2009; Vuong & Martin, 2011). Young children similarly fail (Trueswell, Sekerina, Hill, & Logrip, 1999), a pattern that has been related to protracted cognitive control development (Mazuka, Jincho, & Oishi, 2009; Nilsen & Graham, 2009). And, neuroimaging data reveal overlapping brain activity when healthy adults interpret spoken ambiguities and complete non-syntactic cognitive control tasks like Stroop, implying shared resources (Fedorenko, 2014; January, Trueswell, & Thompson-Schill, 2009; Ye & Zhou, 2009). These associative findings suggest that general-purpose cognitive control functions may help listeners revise “impulsive” interpretations of sentence meaning. But is cognitive control engagement what causes real-time revision to be fairly trouble-free?
We investigate if relative cognitive control engagement facilitates revision. Our approach hinges on a key phenomenon of human cognition: that conflict detection initiates sustained cognitive control, attenuating the cost of processing subsequent conflict (Gratton, Coles, & Donchin, 1992; Ullsperger, Bylsma, & Botvinick, 2005). For example, conflict experienced during the Stroop task (“yellow” in blue ink) diminishes if preceded by another conflict trial, versus a non-conflict trial (“blue” in blue ink) (Freitas, Bahar, Yang, & Banai, 2007; Kerns et al., 2004). This behavioral savings reflects on-the-fly adjustments—“conflict adaptation.” Can such dynamic mobilization of cognitive control similarly tune listeners’ incorrect language-processing commitments? Using a novel cross-task adaptation paradigm, we interleaved Stroop trials with a language comprehension task involving syntactic ambiguity to test if non-syntactic conflict detection initiates domain-general cognitive control processes that facilitate real-time recovery from misinterpretation. We recorded eye-movements as listeners executed spoken instructions, because language input directs attention to relevant objects in the environment; thus, fixation patterns provide important information about listeners’ ongoing interpretative commitments (Tanenhaus, Spivey-Knowlton, Eberhard, & Sedivy, 1995). We therefore examined whether listeners’ moment-to-moment interpretations depend on the relative engagement status of cognitive control, determined by prior Stroop-trial type.
Twenty-three right-handed people (seven men; mean age = 20.4 years, range = 18–28 years) were paid $10/hour for participating. They were healthy, native monolingual speakers of American English with normal or corrected-to-normal vision who provided informed consent. The human subjects review board at the University of Maryland approved all procedures. We excluded data from three subjects because their eye-tracking calibration was poor (>33% of their data was lost). In the twenty remaining subjects (see Chambers, Tanenhaus, & Magnuson, 2004; and Trueswell et al., 1999 for similar sample sizes), we excluded trials with >33% track loss (5.3% of the dataset).
All stimuli were presented in Experiment Builder (SR Research, v1.10.1241).
Participants indicated the ink color of color terms presented on a computer screen, using a three-button mouse to log responses. Color names matched ink colors on 60 congruent trials (“blue”, “green”, and “yellow” in blue, green, and yellow ink, respectively) but mismatched on 60 incongruent trials. We used only response-ineligible color names in the incongruent condition, where the color term did not match any color in the response set (“brown”, “orange”, and “red” in blue, green, or yellow ink). These trial types create conflict between representations, not responses (Milham et al., 2001), ensuring that the control mechanisms that engage would be the same as those needed for the language task, where conflict also arises from two different representations (i.e., interpretations). Response-based conflict may be handled by other control mechanisms (see Egner, 2007, 2008).
Eye-movements were monitored with an EyeLink 1000 eye-tracker (temporal resolution: 1000 Hz; spatial resolution: ≤ 1.5°). Subjects listened to and carried out spoken instructions, prerecorded by a female speaker, to “drag-and-drop” objects around a visual scene on a computer display. For example:
(2) Put the frog on the napkin onto the box.
(3) Put the frog that’s on the napkin onto the box.
In (2), like example (1) earlier, “on the napkin” is temporarily ambiguous between a goal interpretation (where to put the frog) and a modifier interpretation (the frog to be moved is currently on a napkin). As this phrase unfolds, listeners fixate on an empty napkin in a scene (an incorrect goal location), revealing early consideration of the goal analysis of “on the napkin.” However, “onto the box” provides late-arriving disambiguating evidence that signals the true destination, forcing listeners to revise. Eye-movements to the correct goal—a box in the scene (see Figure 1)—are delayed compared to unambiguous instructions, suggesting extra time needed to disengage from the incorrect goal analysis of “on the napkin” (Novick, Thompson-Schill, & Trueswell, 2008). In (3), “that’s” removes the temporary ambiguity, imposing the modifier interpretation.
Subjects heard 24 ambiguous, 24 unambiguous, and 48 filler sentences that all contained imperative “Put” instructions. To minimize the salience of our manipulation and prevent subjects from learning that post-nominal prepositional phrases (e.g., “on the napkin”) are always reduced relative modifiers, we designed filler trials whose scenes visually resembled potential ambiguous or unambiguous trials (like Figure 1), but whose instructions contained a post-nominal locative prepositional phrase (e.g., “Put the cow on the sweater”, where the sweater is the correct destination). This design feature, importantly, should block subjects from adapting to reduced relative constructions (Fine, Jaeger, Farmer, & Qian, 2013) and, therefore, avoiding empty napkins (or bowls, or plates) as possible responses, because that strategy would be wrong half the time. Also, because filler sentences followed Stroop items in the same way that experimental sentences did, there were, crucially, no contingent or predictive relationships between prior Stroop-type and the current sentence-type, namely whether “the napkin” would be the correct or incorrect goal (Schmidt & De Houwer, 2011).
Participants used one mouse button to drag-and-drop relevant objects around the scene. We counterbalanced item locations (e.g., target, competitor, incorrect goal, correct goal) within and across conditions. We created two lists that counterbalanced sentence ambiguity within items: if an item was ambiguous in one list, it was unambiguous in its counterpart list (by inserting “that’s”). Ten subjects were randomly assigned to each list; we stopped data collection once subject numbers were balanced across lists.
We presented participants first with twenty color patches on the screen to learn the mouse-button/color-response mappings, which we counterbalanced across subjects. Then, subjects practiced a block of 144 Stroop trials (intermixed congruent and incongruent in equal proportion) before completing the pseudo-randomly interleaved Stroop-sentence sequences.
To test for cross-task conflict adaptation—namely, if Stroop-conflict detection mobilizes cognitive control procedures that expedite listeners’ subsequent ability to recover from misinterpretation—we pseudo-randomly interleaved Stroop trials and language comprehension trials. We created 12 instances each of four conditions of interest: congruent-ambiguous (CA) pairs, incongruent-ambiguous (IA) pairs, congruent-unambiguous (CU) pairs, and incongruent-unambiguous (IU) pairs. That is, congruent (C) or incongruent (I) Stroop items (trial n−1) could precede either ambiguous (A) or unambiguous (U) sentences (trial n), thereby determining the engagement status of cognitive control during spoken language comprehension (Figure 1). Beyond these pairings, the inclusion of 48 filler sentences (see above) and 72 other Stroop trials created several Sentence-to-Stroop pairs, Sentence-to-Sentence pairs, Stroop-to-Stroop pairs, and Stroop-to-Sentence pairs, which prevented subjects from predicting upcoming trial or task type and from learning contingent relationships between tasks (prior and current trial-type). Given our focus on how cognitive control engagement affects language comprehension per se, analyses and discussion will focus primarily on listeners’ real-time processing of the instructions as the outcome measure.
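The trial counts implied by this design can be checked with a short sketch. The counts are taken from the text; the variable names are illustrative, and the sketch assumes that each critical pair contributes exactly one Stroop trial, which the text implies but does not state outright.

```python
from itertools import product

# Counts reported in the text; names are illustrative, not the authors'.
PAIRS_PER_CONDITION = 12
FILLER_SENTENCES = 48
OTHER_STROOP_TRIALS = 72

stroop_types = ["congruent", "incongruent"]    # trial n-1
sentence_types = ["ambiguous", "unambiguous"]  # trial n

# Cross the two factors to obtain the four critical pair types.
conditions = {
    f"{s[0].upper()}{t[0].upper()}": (s, t)
    for s, t in product(stroop_types, sentence_types)
}

# Assumption: one Stroop trial per critical Stroop-to-Sentence pair.
critical_pairs = len(conditions) * PAIRS_PER_CONDITION
total_sentences = critical_pairs + FILLER_SENTENCES
total_stroop = critical_pairs + OTHER_STROOP_TRIALS

print(sorted(conditions))
print(critical_pairs, total_sentences, total_stroop)
```

Under that assumption, the totals agree with the 96 sentences (24 ambiguous, 24 unambiguous, 48 filler) and 120 Stroop trials (60 congruent, 60 incongruent) described in the materials.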
On each trial, the mouse cursor appeared at the center of the screen for 500 ms, serving as a fixation. If the current trial was a Stroop trial, the cursor disappeared and was replaced with the stimulus item, which remained on the screen for either 1000 ms, or until the participant responded, whichever came first. If the current trial was a sentence trial, all objects in the display simultaneously appeared around the cursor. After a 300-ms delay, subjects heard the instruction. They could move the mouse to execute an action as soon as the objects appeared. A digital camcorder filmed the computer screen to capture subjects’ drag-and-drop actions.
On Stroop trials, we collected accuracy and response time data. On language comprehension trials, we collected mouse-action and eye-movement data. For mouse-actions, participants performed a correct action if they dragged the target directly to the correct goal (e.g., box); they performed an incorrect action if they dragged the target to the incorrect goal (e.g., empty napkin) (see Trueswell et al., 1999). For eye-movements, each quadrant of the screen was labeled as an interest area (Figure 1), and sample reports from Data Viewer (SR Research, v2.2.62) determined fixation proportions.
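To make the quadrant-based scoring concrete, the following is a minimal sketch of how gaze samples might be assigned to interest areas and converted to fixation proportions. It is not the Data Viewer procedure itself: the function names, the top-left coordinate origin, the use of `None` to mark lost samples, and the choice to compute proportions over valid samples only are all assumptions made for illustration.

```python
from collections import Counter

def quadrant(x, y, width, height):
    """Label a gaze sample by screen quadrant (origin assumed at top-left)."""
    horizontal = "left" if x < width / 2 else "right"
    vertical = "top" if y < height / 2 else "bottom"
    return f"{vertical}-{horizontal}"

def track_loss(samples):
    """Fraction of samples with no gaze position (None marks a lost sample)."""
    return sum(s is None for s in samples) / len(samples)

def fixation_proportions(samples, width, height):
    """Proportion of valid samples falling in each quadrant."""
    valid = [s for s in samples if s is not None]
    if not valid:
        return {}
    counts = Counter(quadrant(x, y, width, height) for x, y in valid)
    return {area: n / len(valid) for area, n in counts.items()}

def keep_trial(samples, max_loss=0.33):
    """Apply the >33% track-loss exclusion criterion to one trial."""
    return track_loss(samples) <= max_loss
```

In a given display, each quadrant label would then be mapped to the object it contains (target, competitor, incorrect goal, or correct goal), so that looks to, say, the correct goal can be compared across conditions.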
Our results are organized to address three questions: When cognitive control is relatively activated, do listeners (1) commit fewer offline action errors (involving the incorrect goal); (2) have an easier time revising, reflected in earlier looks to the correct goal (e.g., the box); and (3) consider the incorrect goal (e.g., empty napkin) to a lesser extent?
Statistical analyses were performed using the lme4 software package in R (unless otherwise stated). We fit mixed-effects logistic regression models that included sentence-type, Stroop-type, and their interaction as fixed effects to predict (1) action responses; (2) proportion of looks to the correct goal (indicating revision of an initial misinterpretation); and (3) looks to the incorrect goal (indicating consideration of the wrong interpretation). We crossed subjects and items as simultaneous random effects on the intercept, and note in the text when random slopes improved overall model fit (see Huang, Zheng, Meng, & Snedeker, 2013).1 The critical contrast is between incongruent-ambiguous (IA) and congruent-ambiguous (CA) pairings, to test if Stroop-conflict detection, and thus cognitive control engagement, facilitates recovery from misinterpretation.
Figure 2A shows that subjects committed more action errors in ambiguous (M = 13.7%, 95% CI = 5.8%) than unambiguous (M = 2.2%, 95% CI = 1.8%; χ2(1) = 4.29, p = 0.038; slopes included) conditions. Figure 2B shows that prior Stroop-trial type, and thus relative cognitive control engagement, modulated the ambiguity effect, but had no impact under unambiguous conditions. This was confirmed by a significant Current-by-Previous-Trial-Type interaction (χ2 = 6.34, p = 0.012; slopes included): Specifically, subjects made reliably fewer errors under ambiguous conditions in IA (M = 7.9%, 95% CI = 5.0%) than CA pairings (M = 19.6%, 95% CI = 2.2%; χ2(1) = 6.63, p = 0.010), but not IU versus CU pairings. This result suggests that cognitive control engagement helps prevent comprehension errors.
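For readers who wish to relate the reported χ2(1) statistics to their p-values, the sketch below computes the upper-tail probability of a chi-square distribution with one degree of freedom. It assumes only that the test statistics are referred to that distribution; the function name is illustrative, and the sketch does not refit any model.

```python
import math

def chi2_sf_df1(statistic):
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom."""
    if statistic < 0:
        raise ValueError("chi-square statistic must be non-negative")
    # With 1 df, a chi-square variable is the square of a standard normal,
    # so P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(statistic / 2.0))

# Statistics reported in the paragraph above.
for label, stat in [("ambiguity effect", 4.29), ("IA vs. CA", 6.63)]:
    print(f"{label}: chi2(1) = {stat:.2f}, p = {chi2_sf_df1(stat):.3f}")
```

The closed form applies only to the one-degree-of-freedom case; statistics with other degrees of freedom would require the general chi-square survival function.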
Dwell times on the correct goal revealed the expected ambiguity effect: listeners fixated on the correct goal less during ambiguous (M = 32.8%, 95% CI = 2.5%) than unambiguous sentences (M = 37.4%, 95% CI = 1.6%; χ2(1) = 16.33, p < 0.001). Figure 3 plots the proportion of looks to the correct goal from the onset of “on the napkin” through the end of the sentence, split by previous Stroop-type. As can be seen, subjects looked more to the correct goal under ambiguous conditions when prior Stroop-type was incongruent versus congruent. Correct goal looks under unambiguous conditions were unaffected by prior Stroop-type. A significant Current-by-Previous-Trial-Type interaction confirmed this observation (χ2(1) = 6.14, p = 0.013): cognitive control engagement modulated consideration of the correct goal under ambiguous (but not unambiguous) conditions (CA: M = 30.5%, 95% CI = 2.6%; IA: M = 35.3%, 95% CI = 2.8%; p < .001). This interaction remained significant when including only correct-action trials (χ2(1) = 4.38, p = 0.036), i.e. when listeners correctly revise.
Figure 4A plots looks to the correct goal over time. Subjects fixated on the correct goal less during ambiguous versus unambiguous sentences upon hearing disambiguating evidence (“onto…”); but they appear to recover earlier if Stroop-conflict preceded the ambiguity. Indeed, Figure 4B zooms in on this time-course, showing that between 200–600 ms following the onset of “box”, listeners consider the correct goal earlier when cognitive control is engaged.2 This was confirmed by a significant Current-by-Previous-Trial-Type interaction: looks to the correct goal were greater during IA (M = 71.1%, 95% CI = 6.4%) versus CA pairings (M = 63.9%, 95% CI = 6.8%; χ2(1) = 6.57, p = 0.010), but not IU versus CU pairings. A similar pattern emerged when examining only correct-action trials (χ2(1) = 3.39, p = 0.065). Interestingly, as can be seen in Figure 4B, looking patterns in this 400-ms window for IA pairings paralleled those of unambiguous sentences (CU vs. IA: χ2(1) = 1.93, p = 0.165; IU vs IA: χ2(1) = 1.40, p = 0.236). This finding indicates that cognitive control engagement reliably accelerates recovery from misinterpretation.
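The windowing described here and in footnote 2 (a window beginning 200 ms after the onset of “box”, examined in 50-ms increments) can be sketched as follows. The data representation is an assumption: one Boolean per millisecond, aligned to word onset, indicating whether gaze is on the correct goal. The function name and defaults are illustrative.

```python
def binned_proportions(on_target, start_ms=200, end_ms=600, bin_ms=50):
    """Proportion of looks to the correct goal in consecutive time bins.

    on_target[t] is True when gaze is on the correct goal t ms after word
    onset (one sample per ms, as with a 1000 Hz tracker). Returns a list of
    (bin_start_ms, proportion) pairs covering [start_ms, end_ms).
    """
    bins = []
    for lo in range(start_ms, end_ms, bin_ms):
        window = on_target[lo:min(lo + bin_ms, end_ms)]
        if window:  # skip bins that fall beyond the end of the recording
            bins.append((lo, sum(window) / len(window)))
    return bins
```

Starting the window at 200 ms reflects the time needed to launch a saccade, so earlier looks are not attributed to hearing the disambiguating word.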
Figure 5A plots dwell times on the incorrect goal from the onset of “napkin” until the onset of “onto”, showing the expected ambiguity effect: subjects looked more to the incorrect goal under ambiguous (M = 9.6%, 95% CI = 2.1%) than unambiguous conditions (M = 6.9%, 95% CI = 1.3%; χ2(1) = 33.25, p < 0.001). Figure 5B shows that preceding Stroop-type modulated the effect: consideration of the incorrect goal decreased under ambiguous conditions if the prior Stroop trial was incongruent. This was confirmed by a significant Current-by-Previous-Trial-Type interaction: relative cognitive control engagement reduced incorrect goal looks in IA (M = 7.6%, 95% CI = 2.1%) versus CA items (M = 11.8%, 95% CI = 3.1%; χ2(1) = 4.19, p = 0.041); no reliable difference emerged between IU and CU pairings. This finding suggests that cognitive control engagement attenuates consideration of the wrong interpretation and facilitates disengagement from it.
The evidence thus far strongly indicates that relative cognitive control engagement facilitates listeners’ recovery from sentence misinterpretation in real time. Yet, this interpretation assumes that cognitive control has been initially engaged during the Stroop task, which influences the subsequent spoken language comprehension task. Are the findings we’ve reported contingent on the experience of Stroop-conflict in the first place? We ask this in the context of prior evidence demonstrating that Stroop effects can dissipate over time with practice (MacLeod, 1991).3
Indeed, we observed a marginally significant three-way interaction among experimental half (first vs. second), previous trial-type, and current trial-type (χ2 = 9.11, p = .058), suggesting that our effects are moderated by time. Given the theoretical importance of evaluating the effectiveness of the Stroop manipulation, we therefore separated the data into halves to test the modulating effects of Stroop trial-type on subsequent syntactic ambiguity resolution. Paired-sample t-tests revealed that subjects were slower to respond to incongruent versus congruent Stroop trials during the first half (60 trials) of the experiment (incongruent minus congruent: M = 27 ms, 95% CI = 19 ms; t = 2.86, p = .010), but not the second half (incongruent minus congruent: M = −11 ms, 95% CI = 26 ms; t = −0.87, p > .250). Consequently, if conflict-control procedures per se are involved in shaping the revision of language interpretations, as we have argued, then the findings reported above (e.g., looks to the correct goal) should be observed in only the first half of the experiment (six IA vs. CA pairings), but should not persist into the second half. We therefore re-examined looks to the correct goal from the onset of “box” for each half of the data.
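The Stroop-effect check in each half amounts to a paired comparison of per-subject mean response times. A minimal sketch of that computation follows, assuming one mean response time per subject per condition; the function name is hypothetical and no values from the present dataset are used.

```python
import math
from statistics import mean, stdev

def paired_t(incongruent, congruent):
    """Paired-samples t-test on per-subject mean response times (ms).

    Returns (mean difference, t statistic, degrees of freedom), where the
    difference is incongruent minus congruent.
    """
    if len(incongruent) != len(congruent):
        raise ValueError("each subject needs one value per condition")
    diffs = [i - c for i, c in zip(incongruent, congruent)]
    n = len(diffs)
    if n < 2:
        raise ValueError("at least two subjects are required")
    mean_diff = mean(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)  # sample SD of differences
    return mean_diff, mean_diff / standard_error, n - 1
```

A positive mean difference with a reliable t statistic corresponds to a Stroop effect, as in the first half of the experiment; a difference near zero corresponds to its absence, as in the second half.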
Importantly, as seen in Figure 6, planned contrasts revealed that in the first half of the experiment, when there was a significant Stroop effect, looks to the correct goal under ambiguous (but not unambiguous) conditions were modulated by prior conflict detection (CA: M = 33.1%, 95% CI = 6.7%; IA: M = 51.9%, 95% CI = 6.5%; χ2(1) = 5.52, p = 0.019). In the second half of the experiment, however, in the absence of a Stroop effect, this modulation disappeared (CA: M = 60.9%, 95% CI = 8.4%; IA: M = 63.9%, 95% CI = 8.0%; χ2(1) = 1.66, p = 0.197). Together, this evidence suggests that conflict detection per se, and the theoretical engagement of cognitive control, is actually influencing language re-interpretation processes.
How do listeners effortlessly override misinterpretations of linguistic input? Our findings reveal a cause-and-effect interplay between cognitive control engagement and revising erroneous processing commitments. When non-syntactic cognitive control resources were mobilized following Stroop-conflict detection, listeners committed fewer comprehension errors, considered the correct interpretation more, and dwelled on the incorrect interpretation less than when cognitive control was relatively unengaged. Moreover, revision was immediate: 200 ms after the onset of “box”, eye-movements to the correct goal were indistinguishable from those during unambiguous instructions. This uncovered a 400-ms revision advantage compared to relative cognitive control inactivity, suggesting that adjusting misinterpretations moment-to-moment draws dynamically on general-purpose cognitive control procedures as linguistic input rapidly unfurls.
Ample evidence suggests that the language processing system readily accesses multiple linguistic and extra-linguistic cues to interpretation that guide listeners’ resolution of syntactic ambiguity (Altmann & Steedman, 1988; Chambers et al., 2004; Garnsey, Pearlmutter, Myers, & Lotocky, 1997). For example, a scene with two frogs helps listeners revise because it supports the modifier interpretation of “on the napkin” by providing distinguishing information about the intended frog (Novick et al., 2008; Spivey et al., 2002). While listeners can avoid ambiguity pitfalls by consulting multiple evidential sources, our findings discern aspects of a mental architecture that enables the system to monitor the coordination of these sources, which may conflict at any moment: Active cognitive control can mitigate conflicting cues to interpretation.
Two prior studies suggest a cause-and-effect interplay between cognitive control and language: training cognitive control predicts better sentence re-interpretation over time (Novick, Hussey, Teubner-Rhodes, Harbison, & Bunting, 2014), and readers’ discovery of a language misinterpretation attenuates subsequent Stroop-conflict effects (Kan et al., 2013); thus comprehension difficulty recruits cognitive control. Our research uniquely shows that dynamic recruitment of conflict-control procedures affects real-time recovery from misinterpretation.
Could these effects be ascribed to greater vigilance, rather than conflict-control procedures? Perhaps subjects become more watchful after “tricky” Stroop-incongruent trials. This account also predicts that listeners look less at the incorrect goal when hearing “on the napkin” because they are ready for trickery, and quickly recognize the phrase as a reduced relative modifier. However, filler sentences should have strongly prevented subjects from anticipating reduced relatives; thus, mere vigilance following Stroop-conflict would be an imperfect strategy (see also Kan et al., 2013).
Our filler design also safeguards against subjects implicitly tracking contingencies between Stroop and sentence tasks, because prior Stroop-type did not predict whether “the napkin” would be the correct or incorrect goal. Even if such cross-task contingencies existed, a learning account would predict increased, not decreased cross-task (Stroop-sentence) effects over time, as subjects notice any contingencies. Clearly, our data do not fit this pattern.
We conclude then that cognitive control may act when even minor language comprehension adjustments are necessary to avoid total failure. During processing, multiple interpretations at various levels (words, sentences) are rapidly activated and de-activated, even if these alternatives do not reach conscious awareness (Dahan & Gaskell, 2007; McRae, Spivey-Knowlton, & Tanenhaus, 1998). Dynamic engagement of cognitive control may conspire with linguistic and extra-linguistic constraints to reduce activation of irrelevant alternatives, enabling listeners to override language misinterpretations within milliseconds.
Thanks to Lucia Pozzan, Tina Woodard, Steven Devilbiss, Nicole Grap, and John Trueswell for assistance and discussion. The Center for Advanced Study of Language and the Eunice Kennedy Shriver National Institute of Child Health and Human Development [F32-HD080306] funded this work.
1For each analysis, we selected the final model by first including all fixed effects (i.e., the main effects and their interaction), then by removing predictors until the reduced model did not perform significantly better than the full model (p > .05). We also constructed each model with random slopes on the interaction, but random slopes were typically excluded from further analyses because their inclusion did not improve overall fit.
2We derived this window by first calculating the proportion of time spent looking at the Correct Goal from the onset of hearing “…box” on a ms-by-ms basis. We shifted the window 200 ms after “box” to account for the time that it takes to launch an eye movement after it has been planned, because any eye-movements earlier than this to the correct goal (e.g., the box) could not theoretically be due to hearing “box” (because that is not enough time to make a saccade). We analyzed a coarse-grain window from 200 ms until the offset of the sentence (i.e., the onset of the action period when subjects began to make a mouse movement), with the previous-by-current-trial interaction emerging in the first 400 ms, as assessed by binning eye movements into 50-ms increments.
3Alternative analyses could address whether the experience of conflict modulates behavioral adjustments. For example, Kan et al. (2013) took an individual differences approach to demonstrate that the more perceptual ambiguity one experiences, the smaller their subsequent Stroop-conflict effect. We likewise considered testing whether individual differences in Stroop performance predicted listeners’ subsequent syntactic ambiguity resolution. However, in a separate experiment, Kan et al. interleaved a reading task with Stroop (their outcome measure): while the Stroop effect was reliably attenuated following ambiguous versus unambiguous sentences, individuals’ ambiguity effect size did not predict their carryover effect to Stroop. We therefore reasoned that a time-based analysis might be more informative than individual differences for our causal questions.
N. S. Hsu and J. M. Novick contributed equally to study concept, experimental design, data analysis and interpretation, and to writing and revising this manuscript. N. S. Hsu collected data. Both authors approved the final version of the manuscript for submission.
Study materials are available at: https://osf.io/kt34q/