Kennedy and Pynte (2008) presented data that they suggested pose problems for models of eye movement control in reading in which words are encoded serially. They focus on situations in which pairs of words are fixated out of order (i.e., the first word is skipped and the second fixated prior to a regression back to the first word). We strongly disagree with their claims and contest their arguments. We argue that their data set was obtained selectively and the events they believe are problematic do not occur frequently during reading. Furthermore, we do not consider that Kennedy and Pynte’s arguments pose serious difficulties for serial models of reading such as E-Z Reader.
Kennedy and Pynte (2008) have provided some possibly useful empirical data concerning the consequences of fixating words in reading in a non-canonical order. At the level of empirical data, the information that they presented needs to be carefully considered and replicated. However, questions must be raised about the validity of their interpretation of the data, and the extent to which they really present a “challenge” to existing computational models of eye movement control in reading like E-Z Reader (Pollatsek, Reichle, & Rayner, 2006a; Reichle, Pollatsek, Fisher, & Rayner, 1998; Reichle, Pollatsek, & Rayner, 2006; Reichle, Rayner, & Pollatsek, 2003) and SWIFT (Engbert, Nuthmann, Richter, & Kliegl, 2005). Although there are other implemented models of eye movement control in reading, we will focus on the E-Z Reader model because it is the one primarily targeted by Kennedy and Pynte (2008).
This comment will have three sections. In the first, we will try to clarify what the phenomena are that are reported by Kennedy and Pynte that they consider to be severe problems for serial processing models such as E-Z Reader. In the second, we address the question of whether these data do pose serious problems for current models of eye movement control in reading. In the final section, we discuss the intended scope of the current models and the roles that computational models and empirical data, respectively, should play in advancing understanding of eye movements in reading.
There are two key phenomena that Kennedy and Pynte point to that they consider particularly problematic for serial processing models of reading. The first emerges from an examination of the order of fixations on adjective-noun pairs in French (adjective-noun order in French is not fixed). The findings that they highlight are that (a) readers occasionally fixate these adjacent pairs of words in the reverse order of their order on the page and (b) there is apparently little or no cost in doing so. There are two important points to make here about these data. First, the probability of fixating these words in the reverse order is quite small (less than 5%). Second, fixation durations that occur when these words are fixated out of order are quite brief (well under 200 ms). Thus, as we will argue below, these appear almost to be an epiphenomenon. That is, these apparently incorrectly targeted saccades could be either due to oculomotor error or the reader occasionally being in a mindless reading situation where they “zone out” and then quickly recover (Rayner & Fisher, 1996; Schooler, Reichle, & Halpern, 2004; Vitu, O’Regan, Inhoff, & Topolski, 1995).
The second phenomenon that they give more weight to is that, in a subset of their data (in both English and French), the probability of there being a sequence in a consistent left-to-right order is quite small: only about 15%. However, this probability is taken from a fairly limited subset of the data – only about 20% of the words. Our best construal of where this 15% estimate of “consistent” sequences comes from is the following. First, the 20% of the data being examined are all “strings” of fixations on a sentence that contain more than 11 fixations. (Kennedy and Pynte do not offer a rationale for why they are restricting the analyses to these sequences.) This means, among other things, that this is far from a random sample of the text and instead is a selective sample of long and possibly complex sentences in which the syntax and meaning may not be clear. Second, if the sequence contains any event of the following two categories it would be counted as “inconsistent”: (a) a regression directly to a word that had not been previously fixated; or (b) “successive fixations involving saccades from word n to word k, where (k − n) > 2” (i.e., a forward saccade that skips more than one word).
At first blush, the fact that 85% of these sequences receive an “inconsistent” set of fixations sounds impressive. However, selecting such long sequences maximizes the probability of a deviation between the recorded eye position and the actual eye position. Moreover, Kennedy and Pynte make no mention of excluding any data. Thus, their data apparently include fixations near the beginning of a line. If so, this could easily be the source of many of these reversal “errors”. A common fixation pattern at the beginning of a line includes regressions to words not previously fixated: the return sweep from the prior line often falls short of the beginning of the line, so that the initial fixation on the line is on the second or third word and is followed by a regression to the first word. Moreover, as we will argue below, there are other quite plausible mechanisms for such “inconsistent” fixation sequences. However, before doing so, we would like to briefly mention another data set from which a different picture emerges than from the Kennedy and Pynte data.
In Hogaboam’s (1983) corpus analysis, the frequency of all of the patterns that he reported totaled .592 of all fixation sequences.¹ Here, we will list both the relative frequency he reported as well as an adjusted frequency (in parentheses) based on only those patterns he reported. Hogaboam noted that .424 (.72) of the time the eyes move forward in the text. The most frequently occurring pattern was for the eyes to move from word n to word n+1; this pattern occurred with a relative frequency of .227 (.38). The next most frequently occurring pattern was a move from word n to word n+2 (wherein word n+1 was skipped) with a relative frequency of .124 (.21). These two most frequently occurring patterns were followed by cases in which a forward move was followed by a regression and cases in which a regression was followed by a forward move (with relative frequencies of .08 (.14) for both). Forward skips of more than two words, which by definition for Kennedy and Pynte yielded a non-canonical case, amounted to only .018 (.03). Furthermore, a forward saccade followed by a regression of more than two words amounted to .016 (.027) of the cases, and a regression followed by a forward move of more than two words combined to yield only .047 (.08). This latter pattern of eye movements typically reflects cases in which the reader regresses to an earlier processed word and, after the regression (or sequence of regressions), resumes forward movement by moving beyond the word from which the regression was launched (Rayner, 1998; Rayner, Juhasz, Ashby, & Clifton, 2003).
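The adjusted values in parentheses follow from dividing each reported relative frequency by the total of the reported patterns (.592). A minimal Python sketch of that arithmetic, using only the figures quoted above; the dictionary labels are shorthand introduced here for illustration and are not Hogaboam's own category names:

```python
# Relative frequencies quoted in the text from Hogaboam (1983).
# The labels are illustrative shorthand, not Hogaboam's own category names.
TOTAL_REPORTED = 0.592  # summed frequency of all patterns he reported

reported = {
    "any forward movement": 0.424,
    "word n to word n+1": 0.227,
    "word n to word n+2": 0.124,
    "forward then regression": 0.08,
    "regression then forward": 0.08,
    "forward skip of more than two words": 0.018,
    "forward then regression of more than two words": 0.016,
    "regression then forward move of more than two words": 0.047,
}


def adjusted(frequency, total=TOTAL_REPORTED):
    """Rescale a relative frequency to the share of reported patterns only."""
    return frequency / total


adjusted_frequencies = {label: adjusted(f) for label, f in reported.items()}

for label, value in adjusted_frequencies.items():
    print(f"{label}: {reported[label]:.3f} ({value:.3f})")
```

Rounded to the precision used in the text, these ratios reproduce the parenthesized values.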
In sum, these points serve to raise some question about the severity of the problem of non-canonical fixation patterns. We do not deny that such patterns occur, but we also suspect that they are largely explainable and not the mystery suggested by Kennedy and Pynte. The next section will argue (a) that E-Z Reader, as currently implemented, can account for much of their data and (b) that much of the rest of the “inconsistent” patterns are likely due to aspects of reading that have nothing to do with the encoding of words and are beyond the scope of the E-Z Reader model and thus irrelevant to the question of whether encoding of words is sequential or in parallel.
Let us emphasize at this point that the issue that Kennedy and Pynte are trying to address is whether encoding of words in text is serial (presumably usually left-to-right in English or French texts) or a parallel encoding of two or more words at one time. Thus, phenomena such as people skipping blocks of words because the reader assumes that a sentence is just reiterating a point already made should be irrelevant to the issue (see below).
With that in mind, let’s now consider whether the phenomena discussed above are a serious problem for a serial model of encoding. Let’s consider the French adjective-noun data. Here, for successive word pairs, 5% of the time, readers first fixate the rightmost of the two and then regress to fixate the prior word. In the E-Z Reader model, such patterns are to be expected (although infrequently) in the normal process of word encoding. There are two mechanisms in the E-Z Reader model to explain such patterns. First, there is mistargeting of saccades. That is, if an intended saccade is fairly short (e.g., when the current fixation is near the end of a word and the next word is 3–5 letters), E-Z Reader predicts (consistent with an analysis of a large data set by McConkie, Kerr, Reddix, & Zola, 1988) that the actual saccades will tend to overshoot the intended location (in this case, the middle of the short word) because of a systematic range error. Moreover, due to inherent random variability in saccade targeting, the probability of the saccade skipping the word that the reader was intending to fixate is not that small. Because of this variability, E-Z Reader also predicts that the probability of programming a corrective “refixation” on the word that is the attended word but was inadvertently skipped would be reasonably high because the viewing location following inadvertent skips would be non-optimal. This case has already been discussed in our work (e.g., see Reichle, Rayner, & Pollatsek, 1999).
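The mistargeting account can be illustrated with a toy calculation. The sketch below is not the E-Z Reader implementation: it simply assumes that the landing position equals the intended saccade length plus a systematic error that pulls saccades toward a preferred length, plus Gaussian noise. The parameter values (`preferred_length`, `bias_gain`, `noise_sd`) and the example distances are hypothetical choices made for illustration, not fitted model parameters.

```python
import math


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def p_overshoot(intended_length, word_end,
                preferred_length=7.0, bias_gain=0.5, noise_sd=1.5):
    """Probability that a saccade lands beyond the far edge of its target word.

    Distances are in character spaces from the launch site. All default
    parameter values are hypothetical, chosen only to illustrate the idea.
    """
    # Systematic range error: short intended saccades are lengthened,
    # long intended saccades are shortened.
    systematic_error = bias_gain * (preferred_length - intended_length)
    mean_landing = intended_length + systematic_error
    # Random error: Gaussian scatter around the mean landing position.
    return 1.0 - normal_cdf(word_end, mean_landing, noise_sd)


# Short saccade to the middle of a short word (hypothetical distances).
p_short = p_overshoot(intended_length=3.0, word_end=5.0)
# Saccade of the preferred length to the middle of a longer word.
p_long = p_overshoot(intended_length=7.0, word_end=10.5)

print(f"P(overshoot | short intended saccade) = {p_short:.3f}")
print(f"P(overshoot | longer intended saccade) = {p_long:.3f}")
```

Under these assumed values, an unintended skip is far more likely for the short intended saccade than for the longer one, which is the qualitative pattern the argument relies on.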
A second mechanism for such patterns that we have not explored in depth is implicit in the “guessing” mechanism in E-Z Reader.² That is, we assume that words are sometimes skipped because the reader guesses what the next word is because the prior context makes the identity of that word highly likely. Most of our modeling work has implicitly assumed that this guess is always correct. However, that is unlikely to be the case, and thus, some of the time such a guess will be wrong and under such circumstances a regression back to the word is likely to ensue in order to correctly identify the word. Although we have not yet attempted to model this phenomenon in detail largely because a serious attempt to model it would involve adding a quite complex theory of text comprehension to our model (and at this stage doing so would be premature), we have on two separate occasions simply assumed that some small proportion of “guessed” words are misidentified to examine how such failures might influence the simulated patterns of regressions (see Pollatsek, Juhasz, Reichle, Machacek, & Rayner, 2008; and Rayner, Reichle, Stroud, Williams, & Pollatsek, 2006). (See below for related comments on this point.)
To summarize, explaining the small fraction of such fixation reversals in the French corpus solely within the existing E-Z Reader model (i.e., at the level of word identification) seems quite easy and is thus not a “fatal bullet” for the model or its assumptions about the serial identification of words. We now turn to the other datum: the fact that only about 15% of the long sequences analyzed by Kennedy and Pynte were without an “irregular” saccade (i.e., a regression to a previously non-fixated word or a forward skip of more than one word). The first fairly obvious point to make is that, with such long sequences, even if “irregular” saccades are rare, at least one is reasonably likely to occur in a sequence of 12 saccades. Indeed, if such “irregular” saccades were random and the probability of one happening on any given saccade was .15, one would expect that only about 15% of the sequences of 12 fixations would be free of “irregular” saccades. Nonetheless, this 15% figure is likely beyond the capability of the present E-Z Reader model to explain. However, we do not see this as a serious problem because we have not attempted to build processing of syntax and text comprehension into the model.³ Kennedy and Pynte do not make clear what proportion of the “irregular” saccades are regressions to previously unfixated words and what proportion are due to large (forward) skips of the text. We will assume that neither of the two proportions was negligible and discuss them in turn.
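The arithmetic behind this point can be checked directly. A minimal sketch, assuming that “random” means each saccade is independently “irregular” with the same probability (an assumption made here for illustration; the function names are hypothetical):

```python
def p_regular_sequence(p_irregular, n_saccades):
    """Probability that none of n independent saccades is irregular."""
    return (1.0 - p_irregular) ** n_saccades


def p_irregular_needed(p_regular, n_saccades):
    """Per-saccade irregularity rate that yields a given share of
    fully regular sequences, under the same independence assumption."""
    return 1.0 - p_regular ** (1.0 / n_saccades)


p_clean = p_regular_sequence(p_irregular=0.15, n_saccades=12)
p_needed = p_irregular_needed(p_regular=0.15, n_saccades=12)

print(f"Share of 12-saccade sequences with no irregular saccade: {p_clean:.3f}")
print(f"Per-saccade rate giving exactly 15% regular sequences: {p_needed:.3f}")
```

With a per-saccade rate of .15, roughly 14% of 12-saccade sequences contain no “irregular” saccade, which is close to the 15% figure under discussion.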
First consider regressions to previously unfixated words. As the prior section indicates, E-Z Reader can predict such occurrences, but is unlikely to be able to predict that they will occur 15% of the time. However, additional occurrences of such regressions are easily predicted by what we already know about text comprehension. That is, failures in lexical processing (e.g., encoding the wrong meaning of a homographic word), failures in parsing a sentence correctly (“garden path” sentences being an extreme case), failures in resolution of referring expressions, in fact any interpretative difficulty will lead to regressions in the text. Such regressions, which are reasonably large, are likely to land on an apparently “random” location. Even if the failure can be diagnosed by the reader as stemming from the incorrect encoding of a particular word (as in the homographic word case), the regression that is targeted to that word is likely to fall short because there is systematic undershoot of large saccades. As a result, it is not unlikely that the regression would land on a function word (or short content word) near the content word that was the intended target; moreover, such regression words would be fairly likely to be skipped initially. However, in many such cases, the reader may just be aware that he or she has misprocessed some aspect of the sentence but may not know exactly the source of the problem. Again, under such circumstances, it would not be unlikely that a previously skipped word would be fixated by the regression (Mitchell, Shen, Green, & Hodgson, 2008; Weger & Inhoff, 2007).
Thus, it seems likely that a reasonable number of additional regressions to previously unfixated words can be explained by well-known phenomena associated with disruption to processing in the parsing and text comprehension literature. Needless to say, these phenomena have little or nothing to do with the issue of whether words are encoded serially or in parallel. However, this is just the tip of the iceberg, when discussing reading more generally, for example, say when reading a passage from a newspaper. Such passages have all sorts of intrinsic redundancies often due to a reader’s prior knowledge of the topic. This means that readers may readily skip all or part of a portion of text if they think that this is something that is just repeating what has been already said or telling them something that they already know. (Of course, they may go back and glance at the passage they skipped if they subsequently decide they were wrong about this.) Again, this has nothing to do with how words are encoded from the text during the ongoing automatic processing that occurs during normal reading. Moreover, most readers are likely to pause in such articles to consider and evaluate a point and decide whether or not they fundamentally agree or disagree with it; at this point they may also undertake cognitive processing to commit it to memory in order that it may be produced (for whatever reason) at some later point. The important point here is that these higher order cogitative processes could easily impact on the oculomotor behavior one may observe when a passage from, say, a newspaper, is being read, yet such eye movement behavior would have nothing to do with how words are encoded.
Another point is worth mentioning here. That is, we are quite unclear whether any model of reading predicts very many skips of two words due to the word encoding process. A possible exception is the skipping of two highly predictable function words such as ‘of the’ or ‘do not’. In our view, it is quite plausible that the E-Z Reader model may be able to account for such effects due to the high predictability of such words. Presumably, Kennedy and Pynte’s count of such instances goes beyond this to instances when two successive content words that are not particularly short were skipped. Given that the region of text from which letter information can be extracted is estimated to extend about 7–8 characters to the right of fixation (Rayner, 1998), it seems quite implausible, even for a parallel model, to predict such skipping due to parallel encoding of the letters in these words. Instead, in our view, it is again much more likely that such skipping is due to predictability and redundancy. That is, if someone is reading a passage from an article that is discussing world politics and encounters United Sxxxxx xx Xxxxxxx…, they are quite likely to guess that the next words are States of America and program a large skip. (We used xs in the above to represent letters whose features may not be perceived if the reader was fixating on the beginning of ‘United’.) We think that such instances during general reading are not rare.
Thus, to summarize, the fact that regressions to previously unfixated words and skips of larger than one word occur 15% of the time is entirely consistent with a serial model of word encoding (when the reader is primarily engaged in encoding new words from the text for comprehension). Some of the time, such regressions are due to mistargeting of saccades, some of the time to comprehension failures, and some of the time to skipping over text for which the reader has made an incorrect guess as to the nature of the text and has then had to backtrack. In addition, it’s worth reiterating that the sample from which the 15% figure is derived is likely to be a non-random fifth of Kennedy and Pynte’s total data set. Indeed, the requirement that there be so many fixations on a sentence almost certainly guarantees that they are selectively sampling from sentences with regressions.
Kennedy and Pynte make the point that such non-canonical fixation sequences have little effect on comprehension (which they take to be evidence against a serial word encoding assumption). However, most of this evidence is indirect in two senses. The first, and most obvious, sense is that they are inferring “comprehension” from a pattern of eye movements in the immediate vicinity of the “irregular” fixation sequence. However, they don’t really have a measure indicating whether or not that portion of the text, or even that passage was comprehended. Indeed, the reader may not have fully understood that portion of the passage, but instead moved on in the text hoping either that the later text would clarify the point or that it was not important. Second, given the methodology of comparing passages from these texts, it is not clear how similar the passages were that either did or didn’t have an irregular sequence of eye movements.
Finally, at the outset of their article, Kennedy and Pynte note that for the sentence “Are tourists enticed by these attractions threatening their very existence?” the sequence of fixations by one reader was as follows: “Are tourists enticed these attractions attractions threatening BLINK threatening very their existence?” Here, we’d like to note that when reading aloud, readers do not output the words in the order that they are fixated. Likewise, in silent reading, readers do not hear their inner voice (see Rayner & Pollatsek, 1989 for discussion) saying the word salad that would result strictly from the sequence of fixations. Rather, inner speech yields an orderly sequence of words regardless of the order in which the eyes actually fixate the words. This is because covert attention within the context of the E-Z Reader model can be in a different place than the eyes’ location.
Although Kennedy and Pynte reserve their most telling criticisms for E-Z Reader, they are also disparaging of SWIFT. In some respects, the general attitude towards the implemented models of eye movement control in reading presented in Kennedy and Pynte is reminiscent of points raised by Kennedy (2005) in his keynote address at the European Conference on Eye Movements (ECEM13). He argued that while the models had some value, they were perhaps premature and not based on enough existing data. We strongly disagree with this assessment.⁴ It is very clear that models like E-Z Reader and SWIFT have generated a lot of new data in attempts to validate various predictions that they make. Indeed, a great virtue of the models is that they do generate clear predictions. With respect to E-Z Reader, in our view, there are more studies which have supported basic claims of the model (see Angele, Slattery, Wang, Kliegl, & Rayner, 2008; Drieghe, Rayner, & Pollatsek, 2005, 2008; Inhoff, Greenberg, Solomon, & Wang, 2008; McDonald, 2006; Kliegl, Risse, & Laubrock, 2007; Miellet, Sparrow, & Sereno, 2007; Rayner, Juhasz, & Brown, 2006; Reingold & Rayner, 2006) than those which have purported to present problems for the model (Inhoff, Eiter, & Radach, 2005; Kennedy & Pynte, 2005; Kliegl, 2007; Kliegl, Nuthmann, & Engbert, 2006; but see Pollatsek, Reichle, & Rayner, 2006a, 2006b; Rayner et al., 2007 in response). But the point is that a considerable number of studies have examined the predictions of the model because they are clear and precise. Furthermore, in addition to generating clear predictions, on some occasions E-Z Reader has made it possible to explain data patterns for which explanations might otherwise remain elusive (see Pollatsek et al., 2008).
This is not to deny that empirical data that are independent of the models have value, because they clearly do. Indeed, researchers need not be obsessed with collecting data that are relevant only for adjudicating between different computational models. There are still considerable empirical data to be collected concerning the relationship between eye movements and reading. Kennedy and Pynte’s (2008) data may have value in this regard. However, as we have argued above, it is not clear how many of the phenomena they are reporting bear on the issue of how words are encoded during normal reading – as opposed to how people deal with difficulties associated with aspects of syntax and discourse, how they process text with a reasonable amount of redundancy, or how they skim through text that may be quite predictable. Ideally, of course, it would be nice to have a quantitative model of eye movements in reading that handled all of these phenomena. However, it is clear that our current understanding of parsing does not really allow for such a computational model, and our understanding of text comprehension is in an even more rudimentary state. We have taken certain modest steps to include processing levels in E-Z Reader beyond individual word comprehension as certain data have compelled us to do so (e.g., Rayner, Ashby, Pollatsek, & Reichle, 2004). In addition, it has been modified to account for larger data domains (Rayner, Li, & Pollatsek, 2007; Rayner et al., 2006; Reichle, Warren, & McConnell, 2008). In particular, the most recent version of the model, E-Z Reader 10 (Reichle et al., 2008), is designed to account for some higher order influences on eye fixations and, as a result, it does a better job of accounting for how problems with higher-level (post-lexical) language processing result in inter-word regressions.
To summarize, we have considered the points Kennedy and Pynte argued are problematic for models of reading in which words are encoded serially. Their primary data against serial encoding comprise occasions where pairs of words are fixated out of order (i.e., the first word is skipped and the second fixated prior to a regression back to the first word). We strongly disagree with their claims and contest their arguments. To this end we made a number of points. First, they were highly selective in obtaining their data set. Second, the events that they describe do not occur particularly frequently during reading. Third, E-Z Reader can easily account for such effects for certain types of stimuli (e.g., adjacent short function words). Fourth, even for longer content word pairs it seems that there are sensible reasons why we might observe such behavior and that this would not be a problem for E-Z Reader. Specifically, this may occur when the identity of a word is incorrectly guessed or when there is redundancy or repetition in the text. In short, we do not consider the points made by Kennedy and Pynte to pose any serious difficulties for serial models of reading such as E-Z Reader.
Preparation of this article was supported by Grants HD26765 and HD053639 from the National Institutes of Health and by Leverhulme Trust Grants F00180W and F00180V. We thank Reinhold Kliegl and an anonymous reviewer for their comments on an earlier version.
¹ It is not clear what happened to the other forty percent of the sequences. Hogaboam does note that 10.4% of the fixations were excluded because they were data disturbances or unclassifiable patterns (almost all of which were blinks). But this still leaves roughly 30% of the data unaccounted for. Therefore, the adjusted values that we present represent the percentage of sequences based on only the classifiable sequences in his description.
² By using the term “guessing”, we do not mean to imply that readers make conscious guesses or predictions of what words are coming next. Rather, we are referring to situations in which the context is highly constraining for a given word. In addition, in such situations readers utilize partial information (such as beginning letters and word length) to inform their “guess”.
³ Because of this limitation, our modeling work has generally made no attempt to simulate inter-word regressions. We have therefore excluded trials containing regressions from the sentence corpus used in our simulations. Because these sentences were fairly long (8–14 words; Schilling, Rayner, & Chumbley, 1998), it is perhaps not surprising that most (64%) contained one or more regressions. However, the majority of saccades (~85%) were still in their correct canonical order. And more importantly, even cases involving inter-word regressions say nothing about whether the words are encoded in the canonical order; such cases may reflect problems with high-level language processing and have nothing to do with word encoding per se.
⁴ The points made in this section generally follow from Rayner (2007) in his keynote address at ECEM14.