Word frequency and semantic priming effects are among the most robust effects in visual word recognition, and it has been generally assumed that these two variables produce interactive effects in lexical decision performance, with larger priming effects for low-frequency targets. The results from four lexical decision experiments indicate that the joint effects of semantic priming and word frequency are critically dependent upon differences in the vocabulary knowledge of the participants. Specifically, across two universities, additive effects of the two variables were observed in participants with more vocabulary knowledge, while interactive effects were observed in participants with less vocabulary knowledge. These results are discussed with reference to Borowsky and Besner’s (1993) multistage account and Plaut and Booth’s (2000) single-mechanism model. In general, the findings are also consistent with a flexible lexical processing system that optimizes performance based on processing fluency and task demands.
Word frequency and semantic priming effects are probably the most studied effects in the visual word recognition literature. Frequently encountered words are recognized faster than rarely encountered words. Targets preceded by related primes (e.g., BREAD-BUTTER) are recognized faster than targets preceded by unrelated primes (e.g., DOCTOR-BUTTER). Importantly, on a theoretical level, the two variables interact, with larger semantic priming effects for low-frequency targets than for high-frequency targets (Becker, 1979; Borowsky & Besner, 1993; Plaut & Booth, 2000).
Different mechanisms have been proposed to account for the priming by frequency interaction (see McNamara, 2005, and Neely, 1991, for excellent reviews). We will consider two major perspectives on how the interaction could be accommodated. The first assumes that the word recognition process is best conceptualized as separate, serially organized processing stages and the second assumes that word recognition reflects the operation of a single mechanism within a parallel distributed processing (PDP) network.
The serially organized stage framework is predicated on additive factors logic (Sternberg, 1969), which proposes that an interaction between two variables signifies that the two variables influence at least one common stage, while additive effects (i.e., two main effects and no interaction) are more likely to indicate that the variables influence different stages. Empirically, priming interacts with stimulus quality (e.g., Stolz & Neely, 1995) and with word frequency (e.g., Stone & Van Orden, 1993), but stimulus quality and word frequency produce robust additive effects in lexical decision (Becker & Killion, 1977; Plourde & Besner, 1997; Yap & Balota, 2007). This complex pattern of data has been interpreted within a multistage model (see Figure 1 for an example), where stimulus quality affects the first stage, word frequency affects the second one (i.e., the semantic system), and both stages are sensitive to semantic priming. Specifically, in Borowsky and Besner’s (1993) multistage activation model, words are first “cleaned up” before they are processed by a second stage that is sensitive to word frequency. The priming by frequency interaction implies that these two variables jointly influence the second stage.
Importantly, according to the multistage perspective, rather than influencing the word detectors in the orthographic input lexicon, word frequency is postulated to modulate the mappings between the orthographic input lexicon and the semantic system. That is, high-frequency words possess more efficient mappings between the orthographic input lexicon and semantic system, and therefore, evidence for such words accumulates more rapidly than for low-frequency words. A related semantic context lowers the recognition threshold, and for any given change in criterion, larger priming effects will be observed for low-frequency targets (slower activation rate) than for high-frequency targets (faster activation rate) (see Figure 1’s right panel). This predicts the observed priming by frequency interaction. Of course, a central tenet in this model is that activation in the semantic system, not the orthographic input lexicon, drives lexical decisions (see Borowsky & Besner, 1993, p. 833, for further discussion of this assumption). While this and other assumptions in the multistage model may seem post hoc, Borowsky and Besner have argued that these assumptions are necessary, given the complex joint effects of priming, frequency, and stimulus quality described earlier.
Plaut and Booth’s (2000) PDP account of the combinatorial influence of these variables provides an important alternative. Unlike the stage-based accounts, which incorporate thresholded processing and multiple stages, Plaut and Booth’s model accounts for the priming by frequency interaction and other empirical effects in semantic priming with a single mechanism that mediates input and output processes. Specifically, a non-linear sigmoid mapping between input and output allows equal differences in input to be reflected by equal or unequal differences in the output, depending on the portion of the sigmoid function being examined (see Figure 2). Because high-frequency and related targets possess higher input strengths (i.e., they are located higher on the input continuum), high-frequency targets yield smaller priming effects than low-frequency targets.
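The arithmetic behind this account is easy to demonstrate. The sketch below is a minimal illustration with invented input strengths and a standard logistic function, not the parameters of Plaut and Booth's trained network: it passes the same prime-related boost in input strength through the sigmoid at a low and a high starting point.

```python
import numpy as np

def sigmoid(x):
    # Standard logistic function standing in for the non-linear
    # input-output mapping in a PDP network (illustrative only).
    return 1.0 / (1.0 + np.exp(-x))

prime_boost = 0.5       # hypothetical boost in input strength from a related prime

low_freq_input = 0.0    # low-frequency target: low input strength
high_freq_input = 2.0   # high-frequency target: high input strength

# Priming effect = output with a related prime minus output without one
priming_low = sigmoid(low_freq_input + prime_boost) - sigmoid(low_freq_input)
priming_high = sigmoid(high_freq_input + prime_boost) - sigmoid(high_freq_input)

print(round(float(priming_low), 3))   # 0.122: larger change on the steep portion
print(round(float(priming_high), 3))  # 0.043: smaller change on the flatter portion
```

The identical 0.5 boost in input yields nearly three times the output change for the low-strength target, which is the priming by frequency interaction in miniature.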
There is an ongoing debate about whether the joint effects of priming, word frequency, and stimulus quality are better accommodated by a multistage mechanism or by a single mechanism. Borowsky and Besner (2006) have argued that the single-mechanism model has problems discriminating words from orthographically matched nonwords, that its reliance on semantics for carrying out lexical decision is inconsistent with neuropsychological evidence, and that it is unable to accommodate additive and interactive effects within the same range of RTs typically observed (see also Besner & Borowsky, 2006; Besner, Wartak, & Robidoux, 2008). In response to these criticisms, Plaut and Booth (2006), after carrying out additional modeling, argued that none of these issues are truly problematic for their model. For example, they demonstrated that the model could distinguish consonant-vowel-consonant (CVC) words from CVC nonwords very accurately. The full details of this debate are outside the scope of this paper, but it seems reasonable to conclude that there is currently no consensus on whether a multistage mechanism or a single mechanism better accommodates the extant data. We will revisit this debate in the General Discussion, when we evaluate our data against the two classes of models.
Although it has traditionally been assumed that priming and frequency produce interactive effects, Plaut and Booth (2000) have demonstrated that these two factors do not always interact. Specifically, they observed interactive effects of priming and frequency in participants with high perceptual ability, and additive effects in participants with low perceptual ability, as measured by a standard psychometric test of processing speed called the Symbol Search Test of the Wechsler Intelligence Scale for Children (Wechsler, 1991). In this paper-and-pencil matching-to-sample task, participants are required to indicate, as quickly and accurately as possible, whether either of two meaningless symbols on the left is present in a row of five meaningless symbols on the right. Plaut and Booth decided to focus on perceptual ability because of its links to reading proficiency (Vernon, 1987), abnormal language development (Farmer & Klein, 1995), and early reading acquisition (Detterman & Daniel, 1989).
Plaut and Booth further demonstrated that their single-mechanism model could parsimoniously yield the same three-way interaction between perceptual ability, word frequency, and semantic priming, via the non-linear sigmoid activation function (see Figure 2). For low-perceptual-ability readers, located on the left-hand portion of the graph, the input-output relationship is relatively linear, producing equal-sized priming effects for low- and high-frequency targets. In contrast, for high-perceptual-ability readers, located on the right-hand portion of the graph, the input-output function is more logarithmic in shape, and this yields larger priming effects for low- than for high-frequency targets.
The present study extends the work by Plaut and Booth (2000) by considering the role of lexical integrity on the effects of priming and frequency. By lexical integrity, we are referring to the strength and quality of the underlying lexical representations. Lexical integrity is conceptually very similar to Perfetti and Hart’s (2002) lexical quality, where “quality” is defined by fully specified orthographic representations and fully redundant phonological representations. High integrity representations, due to their coherence and stability, are more likely to be retrieved in a fluent manner. How might lexical integrity modulate the joint effects of priming and frequency? If one assumes that the flow of activation in word recognition is fundamentally an interactive process, low-frequency words, compared to high-frequency words, should have more opportunity to benefit from a related semantic context, since they are further from recognition threshold. For individuals with relatively rich lexical representations (high lexical integrity), one assumes a priori that the same word would be closer to recognition threshold than it would be for individuals with relatively poor lexical representations (low lexical integrity). Specifically, a medium-frequency word for a high-lexical-integrity individual is likely to be a low-frequency word for a low-lexical-integrity individual. So, one might actually expect individuals with lower integrity representations to show a larger influence of semantic context than those with higher integrity representations, all other things being equal. We will henceforth refer to this position as the lexical integrity hypothesis.
In our study, lexical integrity is assessed by vocabulary knowledge (i.e., knowledge of word forms and word meanings). There is evidence that the size of an individual’s vocabulary is positively related to the precision (Perfetti, 2007; Perfetti & Hart, 2002; Verhoeven & Van Leeuwe, 2008) and stability (Kinoshita, 2006; Kinoshita & Mozer, 2006; Paap, Johansen, Chun, & Vonnahme, 2000) of underlying lexical representations. Lexical integrity contrasts well with Plaut and Booth’s (2000) perceptual ability, which primarily has to do with an individual’s speed at processing new information (Tulsky, Saklofske, & Zhu, 2003), and implicates lower-level processes that encode not only letters and words, but also digits, pictures, and objects. Indeed, in Plaut and Booth’s sample, symbol search performance was uncorrelated with vocabulary knowledge (r = .09), as measured by the Peabody Picture Vocabulary Test (Dunn & Dunn, 1981), indicating that these two instruments tap distinct abilities.
Before discussing the specific predictions of the lexical integrity hypothesis, some other findings are relevant. When Tainturier, Tremblay, and Lecours (1992) examined the relationship between educational level (and by extension, vocabulary knowledge) and the magnitude of word-frequency effects in lexical decision, they found that frequency effects were smaller for more educated participants. This indicates that processing differences between low- and high-frequency words are smaller for the readers with (presumably) more vocabulary knowledge. Seidenberg (1985) also reported that rapid decoders produced smaller frequency effects than slow decoders in speeded pronunciation, a pattern that has been replicated by Schilling, Rayner, and Chumbley (1998). These results would appear to predict that the effects of word frequency and semantic priming should be additive for high-lexical-integrity readers and interactive for low-lexical-integrity readers. Specifically, for low-integrity readers, low-frequency words, compared to high-frequency words, are less strongly represented and are processed more effortfully; low-frequency words should therefore benefit more from a related prime. For high-lexical-integrity readers, low-frequency words are so well represented that high- and low-frequency words are processed very efficiently, such that both classes of words will benefit to the same extent from a related prime.
To summarize, in the present study, we explore the joint effects of priming, frequency, and vocabulary knowledge. As discussed, interactive effects of priming and frequency should be associated with readers with less lexical integrity, i.e., less vocabulary knowledge. The individual differences issue, a major theme in this paper, seems timely given researchers’ growing interest in the effects of individual differences on semantic priming. For example, Hutchison (2007) examined the role of attentional control and the relatedness proportion effect in semantic priming. As the proportion of related prime-target pairs in an experiment increases, priming effects become larger. This relatedness proportion effect reflects participants’ effortful generation of likely targets when they encounter a prime. Interestingly, Hutchison reported a positive linear relationship between participants’ attentional control and the magnitude of their relatedness proportion effects, suggesting that individual differences in attentional control modulate strategic processes in semantic priming.
In addition to individual differences, the present study explores the characteristics of response time (RT) distributions to better understand the nature of the interactive effects of these variables. While the joint effects of frequency and priming place important constraints on models of word recognition and priming, these effects are not well-understood at the level of underlying RT distributions. Although mean RTs are faster for semantically related targets than for unrelated targets, differences in mean RTs can be reflected by distributional shifting, skewing, or a mixture of shifting and skewing (Balota & Spieler, 1999; Heathcote, Popiel, & Mewhort, 1991). A recent study reported that semantic priming effects are reflected by distributional shifting (Balota, Yap, Cortese, & Watson, 2008; see Roelofs, 2008, for a discussion of distributional effects in priming for semantic categorization). However, it is still unclear if this shifting applies only to targets with high integrity lexical representations (i.e., high-frequency words) or to targets in general (i.e., high- and low-frequency words). In contrast, word frequency effects are consistently reflected by both a shifting and skewing of the RT distributions (e.g., Andrews & Heathcote, 2001; Balota & Spieler; Yap & Balota, 2007). Because understanding the joint effects of priming and frequency at the distributional level will help impose finer constraints on extant models, the second major theme of the current study is to explore whether or not the priming effects for high- and low-frequency words show qualitatively similar RT distributional profiles across individuals with higher and lower levels of lexical knowledge.
Currently, distributional analyses can be carried out by fitting RTs to a theoretical distribution like the ex-Gaussian distribution (see Van Zandt, 2000, for a discussion of RT distributional analyses), or by averaging RT distributions across a number of participants. In this paper, both techniques are employed. Fitting individual raw RT data to the ex-Gaussian distribution, a three-parameter (μ, σ, τ) function, allows differences in means to be partitioned into distributional shifting (μ) and distributional skewing (τ); importantly, the algebraic sum of μ and τ is the mean of the fitted ex-Gaussian distribution. Vincentizing is a non-parametric technique which computes a number of vincentiles for each participant, where a vincentile is defined as the mean of observations between neighboring percentiles. For example, to obtain 10 vincentiles, the RT data within each condition for a participant is first sorted (from fastest to slowest responses), and the first 10% of the data is then averaged, followed by the second 10%, and so on. Individual vincentiles are then averaged across participants. Vincentizing makes no assumptions about the shape of the underlying RT distribution and examines the raw data directly.
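The vincentizing computation just described can be sketched in a few lines. This is a minimal illustration: trial counts that are not divisible by the number of vincentiles are simply split as evenly as possible here, whereas some implementations interpolate between percentiles, and the function names are invented for the example.

```python
import numpy as np

def vincentiles(rts, n=10):
    # Sort one participant's RTs in one condition from fastest to slowest,
    # split them into n (near-)equal-sized bins, and average within each bin.
    rts = np.sort(np.asarray(rts, dtype=float))
    return np.array([chunk.mean() for chunk in np.array_split(rts, n)])

def group_vincentiles(rts_by_participant, n=10):
    # Average corresponding vincentiles across participants to obtain
    # the RT distribution of a "typical" participant.
    return np.mean([vincentiles(rts, n) for rts in rts_by_participant], axis=0)
```

For example, `vincentiles(range(1, 21))` returns the means of successive pairs of the sorted values (1.5, 3.5, ..., 19.5).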
In a factorial experiment manipulating priming and frequency, it is critical that low- and high-frequency targets are equally related to their related primes. Interestingly, in virtually every published study examining the priming by frequency interaction, the associative strength of high- and low-frequency targets was matched using a rating procedure. For example, Becker (1979) presented participants with word pairs, and asked them to indicate, on a seven-point scale, the likelihood of generating the second word, given the first word. In the current study, we used Nelson, McEvoy, and Schreiber’s (2004) free association norms to select primes (see also Tse & Neely, 2007). This approach possesses two major advantages. First, rating two items (e.g., A and B) as highly related does not indicate if there is a strong A to B connection, or a strong B to A connection (Nelson et al.). Free association norms allow both the magnitude and direction of associations to be taken into account. More importantly, the Nelson et al. norms, which are based on the responses of more than 6000 participants, should provide more reliable estimates of associative strength, compared to rating norms based on relatively small samples of participants.
Nonword type can be manipulated in order to modulate word-nonword discrimination difficulty. Using pseudohomophones (e.g., BRANE), compared to legal nonwords (e.g., FLIRP), increases the similarity between words and nonwords, which yields slower lexical decision RTs and larger word frequency effects (Stone & Van Orden, 1993; Yap, Balota, Cortese, & Watson, 2006). Essentially, pseudohomophones make it more difficult to discriminate between words and nonwords, causing evidence to accumulate more slowly, which in turn exaggerates the magnitude of effects. In this study, we also manipulated nonword type, with the a priori prediction that priming and frequency effects, along with the priming by frequency interaction, would increase in the context of pseudohomophones compared to pronounceable nonwords1. This would provide additional leverage for our exploration of the influence of these variables across different levels of vocabulary knowledge.
Word frequency and semantic priming were factorially manipulated in four lexical decision experiments, and effects were analyzed both at the level of the mean and at the level of RT distributional characteristics. Experiments 1 and 3 (E1 and E3) featured legal nonwords (i.e., orthographically and phonologically plausible, e.g., FLIRP), while Experiments 2 and 4 (E2 and E4) featured pseudohomophonic nonwords (i.e., sound like real words, e.g., BRANE). Note that E3 and E4 were literal replications of E1 and E2, respectively, conducted with an independent pool of participants with varying levels of vocabulary knowledge. Recruiting participant pools from different universities allowed us to test the way in which individual differences modulate the priming by frequency interaction. As a preview, participants in E1 and E2 were associated with faster, more accurate word recognition performance, and higher vocabulary scores than those in E3 and E4. Given that college students in general are already selected for their vocabulary knowledge, this implies that the empirical patterns observed in E3 and E4 are more representative of typical readers, while participants in E1 and E2 are more likely to represent individuals with very high vocabulary knowledge.
One hundred and fifty-six undergraduates participated in the four experiments for course credit or $5 (see Table 1 for a summary of participant characteristics). All participants had normal or corrected-to-normal vision and were recruited from participant pools at Washington University in St. Louis (WUSTL; E1 & E2) and the University at Albany, State University of New York (SUNY-A; E3 & E4). Collapsing across experiments, participants from the two universities were significantly different in years of education, t(144) = 5.91, ηp2 = .20, in vocabulary scores, t(144) = 7.68, ηp2 = .29, but not in age, t < 1; the WUSTL participants had more years of education and higher vocabulary scores than the SUNY-A participants.
In each experiment, Priming (related or unrelated) and Frequency (high or low) were manipulated within subjects. Across experiments, Nonword Type (legal in E1 and E3 or pseudohomophonic in E2 and E4) and University (WUSTL in E1 and E2 or SUNY-A in E3 and E4) were manipulated between subjects. The dependent variables were RT and error rate.
Descriptive statistics for the word and nonword stimuli are presented in Table 2. High- and low-frequency targets were matched on length and orthographic neighborhood size (Coltheart, Davelaar, Jonasson, & Besner, 1977). Primes were selected using the Nelson et al. (2004) free association norms, and prime-target associative strengths were matched for high- and low-frequency targets in both directions (prime-to-target and target-to-prime). Nonwords were orthographically legal and pronounceable in E1 and E3, and pseudohomophonic in E2 and E4. Word and nonword length were matched across all four experiments, while word (M = 5.23, SD = 4.74) and nonword (M = 4.70, SD = 3.96) orthographic neighborhood sizes were matched in E1 and E3. In E2 and E4, the mean orthographic neighborhood size of pseudohomophones (M = 3.68, SD = 3.66) was significantly lower than that of words, p < .0012. Of course, orthographic neighborhood size does not reflect the effect of orthographic neighbors per se but may also reflect the influence of phonological neighbors (i.e., neighbors obtained by substituting a single phoneme). As Mulatti, Reynolds, and Besner (2006) have pointed out, orthographic and phonological neighborhood sizes are highly correlated; for the 300 words in the present study, this correlation was .729. Overall, there were 150 high-frequency words, 150 low-frequency words, and 300 nonwords. Within each frequency range, each target was preceded by either a related or an unrelated prime for a given participant, resulting in 75 observations per participant cell. Four counterbalancing lists were created, and participants were randomly assigned to lists (10, 12, 10, and 7 participants per list in E1 to E4, respectively). No item was repeated within a participant.
PC-compatible computers running E-prime software (Schneider, Eschman, & Zuccolotto, 2001) were used to control stimulus presentation and to collect data. All stimuli were displayed at the center of the computer screen and participants’ responses were made on a computer keyboard. Participants were tested individually in sound-attenuated cubicles, sitting about 60 cm from the screen. Participants first provided demographic information (chronological age, years of education) and completed the vocabulary subscale (a 40-item vocabulary test) of the Shipley Institute of Living Scale (Shipley, 1940; Zachary, 1992) on the computer. The full Shipley scale was originally devised to provide a quick measure of intellectual functioning, and contains vocabulary knowledge (reliability coefficient = .87) and abstract thinking (reliability coefficient = .89) subscales. In our study, we administered the vocabulary subscale, and participants’ vocabulary knowledge was estimated using raw Shipley scores. Despite its age, the Shipley continues to be widely used by researchers and it has been shown to correlate highly with most standard intelligence tests (see Zachary, Paulson, & Gorsuch, 1985, for a review).
Participants were then instructed, on each trial, to silently read the first word, and to then decide whether the subsequently presented letter string formed a word or nonword by making the appropriate button press. Participants were encouraged to respond quickly, but not at the expense of accuracy. Twenty practice trials were then presented, followed by 6 experimental blocks of 100 trials, with mandatory breaks between blocks. The order in which stimuli were presented was randomized anew for each participant. Stimuli were presented in uppercase 14-point Courier, and each trial consisted of the following sequence of events: (a) a fixation point (+) at the center of the monitor for 2000 ms, (b) the prime for 150 ms, (c) a blank screen for 650 ms, and (d) the target. (Thus, the prime-target SOA was 800 ms.) The target remained on the screen for 3000 ms (i.e., response deadline) or until a response was made. Participants made their lexical decisions by pressing the apostrophe key for words and the A key for nonwords. Each correct response was followed by a 450 ms delay (i.e., intertrial interval). If a response was incorrect, a 170 ms tone was presented simultaneously with the onset of a 450 ms presentation of the word “Incorrect” (displayed slightly below the fixation point).
For all experiments, errors and RTs faster than 200 ms or slower than 3000 ms were first excluded, and the overall mean and standard deviation of each participant’s word and nonword RTs were then computed. The overall error rates were 5.3%, 6.3%, 6.6%, and 9.3% in E1 to E4, respectively. Of the remaining responses, any RTs 2.5 SDs above or below each participant’s respective mean (across all conditions) were removed. The percentages of correct responses eliminated as outliers were 2.6%, 2.9%, 2.8%, and 2.8% in E1 to E4, respectively. In general, outlier rates were slightly higher for unrelated (M = 2.8%) than for related (M = 2.1%) targets, but the relative difference between related and unrelated conditions for high- and low-frequency targets was relatively stable across the four experiments.
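This two-step screening can be sketched per participant as follows. It is a minimal illustration: the bookkeeping for per-condition outlier percentages is omitted, and `trim_rts` is a hypothetical helper name, not code from the study.

```python
import numpy as np

def trim_rts(rts, correct, floor=200, ceiling=3000, criterion=2.5):
    # Step 1: drop errors and RTs faster than 200 ms or slower than 3000 ms.
    rts = np.asarray(rts, dtype=float)
    correct = np.asarray(correct, dtype=bool)
    kept = rts[correct & (rts > floor) & (rts < ceiling)]
    # Step 2: drop RTs more than 2.5 SDs above or below the participant's
    # own mean, computed across all conditions.
    m, sd = kept.mean(), kept.std(ddof=1)
    return kept[np.abs(kept - m) <= criterion * sd]
```

For example, with twenty 500 ms correct responses plus a 1000 ms straggler, a 100 ms anticipation, a 3500 ms timeout, and one error, the function returns only the twenty 500 ms trials.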
To perform the distributional analyses, ex-Gaussian parameters (μ, σ, τ) were estimated for each participant across the different experimental conditions, using the quantile maximum likelihood estimation procedure in QMPE 2.18 (Cousineau, Brown, & Heathcote, 2004; Heathcote, Brown, & Mewhort, 2002). This procedure provides unbiased parameter estimates and has been demonstrated to be more effective than continuous maximum likelihood estimation for small samples (Heathcote & Brown, 2004; Speckman & Rouder, 2004). Mean vincentiles for the data were also plotted, providing a graphical complement to the ex-Gaussian fits. As discussed in the Introduction, vincentizing averages RT distributions across participants (Andrews & Heathcote, 2001; Ratcliff, 1979; Rouder & Speckman, 2004; Vincent, 1912) to produce the RT distribution for a typical participant. Note, for each plot, that empirical vincentiles are represented by data points and standard error bars, while the vincentiles for the respective best-fitting ex-Gaussian distribution are represented by lines. The theoretical vincentiles were computed by line search on the numerical integral of the fitted ex-Gaussian distribution (A. Heathcote, personal communication, 5 Jan 2009). The goodness of fit between the empirical and theoretical vincentiles reflects the extent to which the empirical RT distributions are being captured by the ex-Gaussian parameters.
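The study itself used QMPE's quantile maximum likelihood procedure; as an illustrative stand-in, the sketch below recovers the three ex-Gaussian parameters from simulated RTs by continuous maximum likelihood, using SciPy's `exponnorm` distribution (which parameterizes the ex-Gaussian as K = τ/σ, loc = μ, scale = σ). The parameter values and sample size are invented for the demonstration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Simulate one participant's RTs (in ms) from an ex-Gaussian with known
# parameters: a normal component (mu, sigma) plus an exponential tail (tau).
mu, sigma, tau = 500.0, 50.0, 150.0
rts = rng.normal(mu, sigma, 400) + rng.exponential(tau, 400)

# Fit by maximum likelihood, seeding the optimizer with rough guesses.
K, loc, scale = stats.exponnorm.fit(rts, 3.0, loc=450.0, scale=40.0)
mu_hat, sigma_hat, tau_hat = loc, scale, K * scale

# The mean of the fitted distribution is mu_hat + tau_hat.
print(mu_hat, sigma_hat, tau_hat)
```

With 400 simulated trials, the recovered parameters land close to the generating values, illustrating how a mean RT decomposes into a shift component (μ) and a tail component (τ).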
The mean RTs, error rates, and ex-Gaussian parameters of the RT data were submitted to a Priming (related or unrelated) × Frequency (high or low) repeated-measures analysis of variance (ANOVA), with participants treated as random effects. For mean RTs and error rates, an ANOVA was also conducted, with items treated as random effects. We will first consider the results for each experiment. This will be followed by cross-experiment analyses which include Nonword Type (legal or pseudohomophonic) and University (SUNY-A or WUSTL) as between-subject variables.
The mean RTs, error rates, and ex-Gaussian parameters are displayed in Table 3. The test statistics for the omnibus ANOVA by participants and by items are presented in Table 4. Importantly, there were large main effects of priming and frequency, but these two variables did not interact in the RT data. It is also interesting to note that the priming effect was fully mediated by μ; that is, semantic priming reflected pure distributional shifting (replicating the pattern reported by Balota et al., 2008), and this shift was of similar magnitude for high- and low-frequency targets. It is also noteworthy that there was evidence of a priming by frequency interaction in error rates (in both participant and item analyses), with larger priming effects for low-frequency targets than for high-frequency targets.
The vincentile plots provide converging support for distributional shifting as a function of prime relatedness. As can be seen in Figure 3, the priming effects for both high- and low-frequency targets were approximately the same size and constant across the vincentiles. To further explore the reliability of this pattern, we conducted an ANOVA with Vincentile as a within-subject variable3. This revealed that neither the Priming × Vincentile (p = .88) nor the Priming × Frequency × Vincentile interaction (p = .55) approached significance, confirming that priming is relatively invariant across the RT distribution (i.e., reflecting a simple distributional shift).
The mean RTs, error rates, and ex-Gaussian parameters are displayed in Table 5. The test statistics for the omnibus ANOVA by participants and by items are presented in Table 6. Again, there were clear and large additive effects of Priming and Frequency in RTs and μ. There was also a Priming × Frequency interaction in error rates, with larger priming effects for low-frequency targets. Thus, across two experiments, additive effects of priming and frequency were observed in RTs, but interactive effects of the two variables were observed in error rates. This is an interesting pattern that will be discussed in greater detail in the General Discussion. Finally, as in E1, priming effects were fully mediated by μ (i.e., distributional shifting), and were qualitatively similar for low- and high-frequency targets.
The vincentile plots (see Figure 4) show that the priming effect was relatively constant across the RT distribution, and of the same magnitude for high- and low-frequency targets. Neither the Priming × Vincentile (p = .82) nor the Priming × Frequency × Vincentile interaction (p = .46) was significant.
To examine the effect of nonword type on the effects of word frequency and semantic priming, we included Nonword Type as a between-subjects factor and submitted the mean RTs, error rates, and ex-Gaussian parameters to a Priming × Frequency × Nonword Type (legal or pseudohomophonic) mixed-factor ANOVA. Only interaction effects with Nonword Type are reported.
For mean RTs, the Priming × Nonword Type interaction did not reach statistical significance [F (1, 86) = 2.68, MSE = 784.46, p = .11, ηp2= .03]; the priming effect was numerically smaller (by 11 ± 13 ms) when nonwords were pseudohomophones than when they were legal. The Frequency × Nonword Type interaction was significant [F (1, 86) = 9.21, MSE = 561.91, p < .01, ηp2= .10]; the word frequency effect was 16 ± 10 ms larger when nonwords were pseudohomophones. Neither the main effect of Nonword Type nor the 3-way interaction was significant, all Fs < 2.35, ps > .12. For error rates, the Nonword Type main effect was not significant and Nonword Type also did not interact with any other variable, all Fs < 1.
For μ, the Nonword Type main effect was not significant, but the Priming × Nonword Type interaction was marginally significant [F (1, 86) = 3.52, MSE = 1107.33, p = .06, ηp2= .04]. Interestingly, the priming effect in μ was 14 ± 15 ms smaller when nonwords were pseudohomophones. No other interactions involving Nonword Type were significant, all Fs < 1.92, ps > .17. Turning to σ, the Nonword Type main effect was not significant and Nonword Type did not interact with any other variable, all Fs < 1.16, ps > .28. Finally, for τ, the main effect of Nonword Type approached significance [F (1, 86) = 3.41, MSE = 27176.18, p = .07, ηp2= .04], but this was qualified by the marginally significant Frequency × Nonword Type interaction [F (1, 86) = 3.42, MSE = 839.26, p = .07, ηp2= .04]. The word frequency effect was 12 ± 13 ms larger in τ when nonwords were pseudohomophones. Nonword Type did not interact with any other variables, all Fs < 1. In summary, when pseudohomophones were used as nonwords, word frequency effects became larger in the tail of the RT distribution, but priming effects became smaller in the modal portion of the RT distribution.
The mean RTs, error rates, and ex-Gaussian parameters are displayed in Table 7. The test statistics for the omnibus ANOVA by participants and by items are presented in Table 8. The main effects of Priming and Frequency were highly reliable in RTs, error rates, and μ. More importantly, in the participant analyses, the Priming × Frequency interaction was significant (or approached significance) in RTs, error rates, μ, and σ. (In the item analyses, the interaction was significant in error rates and approached significance in RTs.) In general, there was greater priming for low-frequency targets than for high-frequency targets in these measures. Moreover, the results from the ex-Gaussian analyses indicated that the priming for high-frequency targets (40 ms) was mediated largely by shifting (32 ms). In contrast, priming for low-frequency targets (65 ms) was mediated by shifting (63 ms) and σ (18 ms), but not by skewing. In other words, the larger priming effect for low-frequency targets was driven by both μ (distributional shifting) and σ (greater variability in the modal RTs).
The results from the ex-Gaussian analyses are broadly consistent with the vincentile plots (see Figure 5). Although the Priming × Frequency × Vincentile interaction was not significant (p = .34), this may have been due to noise in the final two vincentiles. When we restricted our analyses to the first eight vincentiles, the three-way interaction was significant, p = .036, indicating that across the RT distribution, priming effects were constant in magnitude for high-frequency targets, but increasing (due to greater variability of the modal RTs) for low-frequency targets. Despite the post hoc nature of this analysis, it is noteworthy that the three-way interaction holds for the first eight vincentiles, which constitute a clear majority of the dataset. In marked contrast, reexamining the Priming × Frequency × Vincentile interaction in the first two experiments, using only the first eight vincentiles, yielded non-significant interactions for both E1 and E2, Fs < 1.
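Vincentizing itself is straightforward: each participant's RTs are sorted and divided into equal-count bands, and the mean within each band is computed. A minimal sketch on simulated data (all parameter values are illustrative, not the experiments' values):

```python
import numpy as np

def vincentiles(rts, n_bins=10):
    """Mean RT within each of n_bins equal-count bands of the sorted RTs."""
    srt = np.sort(np.asarray(rts, dtype=float))
    return np.array([band.mean() for band in np.array_split(srt, n_bins)])

# Simulated related/unrelated RTs with equal shift but a longer tail
# in the unrelated condition (hypothetical values)
rng = np.random.default_rng(0)
related = rng.normal(600, 60, 2000) + rng.exponential(100, 2000)
unrelated = rng.normal(640, 60, 2000) + rng.exponential(140, 2000)

# Priming effect at each vincentile; a tail difference makes the
# effect grow across vincentiles rather than stay constant
priming_by_vinc = vincentiles(unrelated) - vincentiles(related)
```

Plotting `priming_by_vinc` against vincentile number is the kind of display summarized in Figure 5: a flat line indicates pure shifting, while an increasing line indicates skewing or variability differences.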
The mean RTs, error rates, and ex-Gaussian parameters are displayed in Table 9. The test statistics for the omnibus ANOVA by participants and by items are presented in Table 10. Main effects of Priming and Frequency were observed in RTs, error rates, μ, and τ. There was also a Priming × Frequency interaction in RTs, σ, and τ, in both participant and, where applicable, item analyses. In all of these measures except σ, priming effects were larger for low-frequency targets than for high-frequency targets. In E4, the interaction between Frequency and Priming occurred in σ and τ, but not in μ, in contrast to the findings in E3 (legal nonwords), where the interaction was mediated by μ and σ. Specifically, priming for high-frequency targets (37 ms) reflected mainly distributional shifting (26 ms). Low-frequency targets yielded the opposite pattern, where priming (65 ms) reflected some shifting (24 ms) but mostly skewing (42 ms). As a result, the Priming × Frequency interaction in mean RTs observed in the SUNY-A students can be attributed to the slow RTs in the tail of the distribution.
These trends are compatible with the vincentile plots (see Figure 6), which confirm that the Priming × Frequency interaction was indeed largest in the slowest vincentiles. The Priming × Frequency × Vincentile interaction was significant (p = .006), and follow-up analyses revealed that the Priming × Vincentile interaction was significant for low-frequency targets (p < .001), but not for high-frequency targets (p = .18), reinforcing the idea that priming for high-frequency words was mediated mainly by distributional shifting, but priming for low-frequency words was mediated by a mixture of skewing and shifting.
In order to directly investigate the influence of nonword type, the data from Experiments 3 and 4 were combined. The mean RTs, error rates, and ex-Gaussian parameters were submitted to a Priming × Frequency × Nonword Type mixed-factor ANOVA. Only the effects associated with Nonword Type are reported.
For mean RTs, the Frequency × Nonword Type interaction was significant [F (1, 66) = 6.61, MSE = 1739.45, p < .05, ηp2 = .09]; the word frequency effect was 27 ± 21 ms greater when nonwords were pseudohomophones. Neither the main effect of Nonword Type nor other interactions associated with Nonword Type were significant, all Fs < 1.96, p > .16. For error rates, the main effect of Nonword Type was significant [F (1, 66) = 4.84, MSE = 64.28, p < .05, ηp2 = .07]; the error rate was 2.2 ± 2.0% higher when nonwords were pseudohomophones. The marginally significant Frequency × Nonword Type interaction [F (1, 66) = 3.82, MSE = 16.16, p = .06, ηp2 = .06] further showed that the word frequency effect was 2.0 ± 2.0% larger when nonwords were pseudohomophones. No other interactions associated with Nonword Type approached significance, all Fs < 1.
Consistent with the results from the Washington University sample, for μ, the Priming × Nonword Type interaction was significant [F (1, 66) = 8.44, MSE = 964.74, p < .01, ηp2 = .11]; priming effects were 23 ± 16 ms smaller when nonwords were pseudohomophones. In addition, there was a significant Priming × Nonword Type × Frequency interaction [F (1, 66) = 5.13, MSE = 850.11, p < .05, ηp2 = .07], which was due to Priming and Frequency yielding interactive effects in the legal nonword condition and additive effects in the pseudohomophone condition. The main effect of Nonword Type was not significant, nor were any other interactions involving Nonword Type, all Fs < 1. Turning to σ, the Priming × Nonword Type × Frequency interaction was significant [F (1, 66) = 6.37, MSE = 881.52, p < .05, ηp2 = .09]. In the legal nonword condition, priming effects were 22.4 ± 22.8 ms larger in σ for low-frequency targets than for high-frequency targets. However, in the pseudohomophone condition, the pattern was reversed, with 15 ± 14 ms larger priming effects in σ for high-frequency than for low-frequency targets. The main effect of Nonword Type was not significant, nor were any other interactions involving Nonword Type, all Fs < 1.60, ps > .21. Finally, for τ, the main effect of Nonword Type [F (1, 66) = 4.64, MSE = 34286.76, p < .05, ηp2 = .07] was significant; τ was 50 ± 46 ms larger when nonwords were pseudohomophones. Both the Priming × Nonword Type [F (1, 66) = 9.20, MSE = 934.40, p < .01, ηp2 = .12] and the Frequency × Nonword Type interactions [F (1, 66) = 6.12, MSE = 1716.94, p < .05, ηp2 = .09] were significant. In the presence of pseudohomophones (compared to legal nonwords), priming effects were 23 ± 15 ms larger and word frequency effects were 25 ± 20 ms larger. The Priming × Nonword Type × Frequency interaction was also significant [F (1, 66) = 4.12, MSE = 1125.43, p < .05, ηp2 = .059].
The three-way interaction in τ was due to Priming and Frequency yielding interactive effects (i.e., larger priming effects for low-frequency words) only when nonwords were pseudohomophones. To summarize, when pseudohomophones were used as nonwords, priming effects became smaller in the modal portion of the RT distribution while word frequency effects became larger in the tail of the RT distribution, replicating the trends observed in the first two experiments. In addition, for E3 and E4, one also observes larger priming effects and a Priming × Frequency interaction in the tail of the distribution when pseudohomophones were used.
As shown in Table 1, SUNY-A participants and WUSTL participants differed in their years of education and vocabulary knowledge. To examine whether University modulated the Priming × Frequency interaction in our experiments, we included the data from all four experiments and submitted the mean RTs, error rates, and ex-Gaussian parameters to a Priming × Frequency × Nonword Type × University (WUSTL or SUNY-A) mixed-factor ANOVA4. Only the effects associated with University are reported.
For mean RTs, the main effect of University was significant [F (1, 152) = 19.22, MSE = 60002.91, p < .01, ηp2 = .11]; the WUSTL participants were 88 ± 39 ms faster than the SUNY-A participants. The Priming × University interaction was significant [F (1, 152) = 7.19, MSE = 959.73, p < .01, ηp2 = .05]; priming effects were 14 ± 10 ms larger in SUNY-A. Importantly, the three-way interaction was also significant [F (1, 152) = 5.11, MSE = 663.36, p < .05, ηp2 = .03]; the Priming × Frequency interaction (i.e., greater priming for low-frequency targets) was 20 ± 17 ms larger for SUNY-A participants than for WUSTL participants. None of the other interactions associated with University was significant, all Fs < 2.45, ps > .12. For error rates, neither the main effect of University nor any interaction associated with University approached significance, all Fs < 2.69, ps > .10.
For μ, the main effect of University and the University × Frequency interaction were significant [F (1, 152) = 11.58, MSE = 14462.20, p < .01, ηp2 = .07 and F (1, 152) = 6.91, MSE = 758.31, p < .01, ηp2 = .04 respectively]. More importantly, for μ, the University × Priming × Nonword Type × Frequency interaction was also significant [F (1, 152) = 4.05, MSE = 708.31, p < .05, ηp2= .03]. Although WUSTL participants produced additive effects of Frequency and Priming for both legal nonwords and pseudohomophones, SUNY-A participants produced additive effects for pseudohomophones but interactive effects for legal nonwords. No other interaction associated with University was significant, all Fs < 1. Turning to σ, the main effect of University was significant [F (1, 152) = 6.88, MSE = 1587.45, p < .01, ηp2 = .04]. Like μ, the University × Priming × Nonword Type × Frequency interaction was also significant [F (1, 152) = 5.25, MSE = 623.86, p < .05, ηp2= .03]. This four-way interaction was driven by two opposing Priming × Nonword Type × Frequency interactive effects in the SUNY-A participants, discussed above in the analyses for Experiments 3 and 4. None of the other interactions associated with University was significant, all Fs < 2.00, ps > .16. Finally, for τ, the main effect of University and the University × Priming interaction were significant [F (1, 152) = 14.14, MSE = 30265.47, p < .01, ηp2= .09 and F (1, 152) = 7.72, MSE = 1025.10, p < .01, ηp2= .05 respectively]. Importantly, the University × Priming × Nonword Type × Frequency interaction was also significant [F (1, 152) = 4.02, MSE = 831.61, p < .05, ηp2= .03]. This four-way interaction was driven by SUNY-A participants producing a Priming × Frequency interaction (i.e., greater priming in τ for low-frequency targets) only when nonwords were pseudohomophones.
Overall, the results indicate that vocabulary knowledge indeed predicts the word frequency by semantic priming interaction, as reflected by between-University comparisons. In order to assess the influence of vocabulary knowledge more directly, we also conducted analyses of covariance (ANCOVAs) as a function of vocabulary knowledge (lower vs. higher, initially defined by a median split on Shipley scores), with nonword type as a covariate. Participants from the two universities were combined, effectively ignoring the University variable. The three-way Priming × Frequency × Vocabulary Knowledge interaction was not significant. However, using a median split to dichotomize Vocabulary Knowledge, a continuous measure, would have diminished the statistical power of our analysis (Cohen, 1983; Humphreys, 1978; Maxwell & Delaney, 1993). When we used the top third and bottom third of the Shipley scores to define high- and low-vocabulary-knowledge participants respectively, while at the same time ensuring that each counterbalancing list was equally represented across participants in each group, the three-way interaction was indeed reliable, F (1, 101) = 4.45, MSE = 607.43, p = .037, ηp2= .042. The Priming × Frequency interaction was reliable for the low-vocabulary-knowledge group, who showed greater priming for low-frequency targets (d = 60 ms) than for high-frequency targets (d = 33 ms), p = .008. High-vocabulary-knowledge participants showed more similar priming for low-frequency (d = 44 ms) and high-frequency (d = 34 ms) words, p = .096. Importantly, the significant three-way interaction was entirely mediated by the τ parameter, F (1, 101) = 6.99, MSE = 870.85, p = .009, ηp2= .065, which is consistent with the idea that the tail of the RT distribution is especially sensitive to the stability of lexical representations.
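The extreme-group (top-third vs. bottom-third) split described above can be sketched as follows. This is an illustrative implementation on hypothetical Shipley scores; it omits the additional constraint of balancing counterbalancing lists within groups.

```python
import numpy as np

def extreme_groups(scores, prop=1 / 3):
    """Indices of the bottom and top `prop` of scores (extreme-group split),
    used in place of a median split to preserve statistical power."""
    scores = np.asarray(scores, dtype=float)
    lo_cut = np.quantile(scores, prop)
    hi_cut = np.quantile(scores, 1 - prop)
    return np.where(scores <= lo_cut)[0], np.where(scores >= hi_cut)[0]

# Hypothetical Shipley raw scores for twelve participants
shipley = np.array([22, 35, 28, 31, 40, 25, 33, 38, 29, 36, 27, 39])
low, high = extreme_groups(shipley)   # middle third is excluded
```

Dropping the middle third sharpens the group contrast relative to a median split, at the cost of a smaller sample, which is why the three-way interaction emerged only under this split.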
We further pursued this pattern by conducting hierarchical multiple regression analyses to determine whether participants’ vocabulary scores predicted the magnitude of their Priming × Frequency interaction after controlling for appropriate variables. We entered participants’ age and years of education in the first step, whether they received legal nonwords (i.e., 0) or pseudohomophones (i.e., 1) in the second step, and their vocabulary raw score in the final step. The dependent measures were the “interaction” scores (i.e., the difference in the priming effect for low-frequency targets and for high-frequency targets) in mean RTs, error rates, μ, σ and τ. (We also conducted parallel analyses using z-transformed RTs to control for overall differences in processing speed, see Faust, Balota, Spieler, & Ferraro, 1999, and found qualitatively identical findings.) Importantly, after partialling out variance accounted for by age, years of education, and nonword type, vocabulary scores predicted the interaction score in RTs [β = −.168, p = .063] and τ [β = −.195, p = .033], but not in error rates [β = .012, t < 1], μ [β = .037, t < 1], or σ [β = .001, t < 1]. The negative regression coefficients indicate that participants with more vocabulary knowledge produced smaller priming differences between low- and high-frequency targets in both mean RTs and in τ5. Together with the ANOVAs described earlier, these regression analyses provide converging evidence that higher-knowledge readers are more likely to produce additive effects of priming and frequency, while lower-knowledge readers are more likely to produce interactive effects of the two variables, with the interaction primarily occurring in the tail of the RT distribution.
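The stepwise logic of these hierarchical regressions (demographics first, then nonword type, then vocabulary, tracking the variance each step adds) can be sketched with ordinary least squares. All simulated effect sizes below are invented for illustration; only the step structure mirrors the analysis described above.

```python
import numpy as np

def r_squared(X, y):
    """R^2 from an OLS fit with an intercept term."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1.0 - (resid ** 2).sum() / ((y - y.mean()) ** 2).sum()

rng = np.random.default_rng(2)
n = 120
age = rng.normal(20, 2, n)
educ = rng.normal(14, 1.5, n)
nonword = rng.integers(0, 2, n).astype(float)  # 0 = legal, 1 = pseudohomophone
vocab = rng.normal(30, 5, n)
# Simulated interaction score: higher vocabulary -> smaller interaction
interaction = 50 - 1.2 * vocab + 5 * nonword + rng.normal(0, 10, n)

# Step 1: demographics; Step 2: + nonword type; Step 3: + vocabulary
r2_step1 = r_squared(np.column_stack([age, educ]), interaction)
r2_step2 = r_squared(np.column_stack([age, educ, nonword]), interaction)
r2_step3 = r_squared(np.column_stack([age, educ, nonword, vocab]), interaction)
delta_r2_vocab = r2_step3 - r2_step2  # unique variance for vocabulary
```

The increment `delta_r2_vocab` captures what the final-step β coefficients test: whether vocabulary predicts the interaction score after the earlier predictors are partialled out.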
Interestingly, we also have access to a new dataset that provides converging support for the findings reported in this study. This dataset is based on an in-progress multiuniversity primed word recognition megastudy that includes participants from WUSTL and SUNY-A. Importantly, using tests from the Woodcock-Johnson Tests of Achievement (WJ III; Woodcock, McGrew & Mather, 2001), participants’ vocabulary knowledge was assessed by asking them to generate synonyms and antonyms for printed words, and having them complete analogies (e.g., elephant – big; mouse – ?). As before, participants from WUSTL and SUNY-A were combined, and for each participant, a composite measure of vocabulary knowledge based on WJ III scores (Synonyms, Antonyms, & Analogies) was computed. We used the top third (n = 82) and bottom third (n = 82) of the WJ III scores to define high- and low-vocabulary-knowledge participants respectively, while ensuring that each counterbalancing list was equally represented across participants in each group. In lexical decision performance, the Priming × Frequency interaction was larger for low-knowledge participants (d = 27 ms) than for high-knowledge participants (d = 15 ms). More importantly, the critical three-way interaction between priming, frequency, and vocabulary knowledge approached or reached significance for raw RTs (p = .092) and z-transformed RTs (p = .016). Hence, the three-way interaction was replicated in an independent sample of participants using a different measure of vocabulary knowledge.
This study yielded the following noteworthy findings. First, in line with Plaut and Booth (2000), semantic priming and word frequency do not always interact in lexical decision performance. However, in contrast to Plaut and Booth, who found additive effects in their low-perceptual-ability participants, we observed additive effects in higher-vocabulary-knowledge readers and interactive effects in lower-vocabulary-knowledge readers. Second, the RT distributional analyses revealed interesting new constraints on the semantic priming effect that replicate and extend the distributional effects reported in Balota et al. (2008).
Since Becker (1979) first reported larger semantic priming effects for low-frequency targets compared to high-frequency targets, the semantic priming by frequency interaction has become a benchmark finding in the semantic priming literature (see Neely, 1991, and McNamara, 2005, for reviews). The present study, along with Plaut and Booth (2000), indicates that the interaction may not be as robust as researchers have heretofore assumed. Instead, whether the two variables interact or not seems to depend on individual differences, as reflected by perceptual ability and vocabulary knowledge. In Plaut and Booth’s study, high-perceptual-ability participants produced interactive effects while low-perceptual-ability participants produced additive effects. In our study, we observed a different pattern: additive effects when participants had more vocabulary knowledge, and interactive effects when participants had less vocabulary knowledge. If perceptual ability and vocabulary knowledge both broadly reflect the fluency of lexical processing, then one would expect the same three-way interaction in both studies. This puzzling discrepancy will be discussed in greater depth later.
Returning to our results, the collective analyses suggest that the semantic priming by frequency interaction is more likely to emerge for low-vocabulary-knowledge participants6. For high-vocabulary-knowledge participants, priming and frequency effects were additive at the level of the mean response latency, and this additive pattern persisted whether legal nonwords or pseudohomophones were used as distracters. More intriguingly, the distributional analyses (see Table 3 and Table 5 and Figure 3 and Figure 4) indicate that semantic priming was primarily reflected by a shift in the RT distribution, replicating the findings reported by Balota et al. (2008). Distributional shifting is most consistent with a simple head-start mechanism in lexical processing, whereby the effect of the prime is to pre-activate the target representation, which then speeds up lexical access by some constant amount of time. Interestingly, the RT distributions for high- and low-frequency targets were shifted to the same extent by semantic priming, likely reflecting the fact that we controlled for associative strength across high- and low-frequency words.
In contrast, for participants with relatively less vocabulary knowledge, priming and frequency clearly interacted, with larger priming effects for low-frequency targets. Obviously, these results are more consistent with the extant literature, where greater priming for low-frequency targets is usually reported. When one considers the legal nonword condition (E3), the distributional analyses (see Table 7 & Figure 5) revealed that priming for high-frequency targets was reflected predominantly by distributional shifting, while priming for low-frequency targets was reflected by shifting and greater variability in modal RTs. Specifically, for high-frequency targets, the magnitude of the priming effect was relatively invariant across vincentiles, while for low-frequency targets, priming effects increased in size as RTs became longer. These trends are even clearer when word-nonword discrimination difficulty was increased by using pseudohomophonic nonwords (see Table 9 & Figure 6). Here, even high-frequency targets showed some evidence of distributional skewing in priming (although these trends were not statistically significant), while for low-frequency targets, priming effects were relatively stable across the first five vincentiles, but increased dramatically (from 40 ms to 120 ms) in the slower vincentiles.
The present findings can be reconciled with the lexical integrity hypothesis in a straightforward manner. For low-lexical-integrity participants with less vocabulary knowledge, pure distributional shifting, and its attendant head-start mechanism, was observed only when targets were strongly represented (i.e., high-frequency words). When targets (i.e., low-frequency words) had relatively less integrity, target processing was further from threshold, and there was greater reliance on prime information for resolving these difficult targets, with reliance being proportional to the difficulty of the trial. For high-lexical-integrity participants with more vocabulary knowledge, high- and low-frequency words were fluently processed due to their equally strong representations; here, priming reflected a simple head-start mechanism. In fact, these results mesh well with Balota et al.’s (2008) study of the joint effects of target stimulus quality and priming on lexical decision and speeded pronunciation performance. In that study, when target words were presented clearly, priming produced a simple shift in the RT distribution, but when words were visually degraded, priming effects became larger as RTs became longer. Degrading target words increased processing difficulty, which in turn increased reliance on prime information. According to Balota and colleagues (2008), this is consistent with the idea that when target processing is relatively degraded, the system increases reliance on (or retrieval of) the prime information to resolve the degraded target, consistent with recent arguments by Bodner and Masson (2001). Collectively, these findings can also be seen as compatible with the interactive compensatory framework (Stanovich, 1980), which proposes that priming is more automatic for fluent lexical processing and more strategic for less fluent lexical processing. 
Of course, we need to acknowledge that vocabulary knowledge, as measured by Shipley raw scores alone, is at best a relatively crude proxy for the integrity of underlying lexical representations. Future work examining lexical integrity should consider using more global measures of vocabulary knowledge (e.g., tests of synonyms, antonyms, and lexical analogies on the WJ III, Woodcock et al., 2001). Interestingly, using the WJ III measures to quantify individual differences in lexical integrity yielded the critical three-way interaction, as shown in the composite analyses. More notably, lexical integrity is multidimensional and reflects the quality of the orthographic, phonological, and semantic constituents of a representation, as well as the mapping between these constituents (Perfetti & Hart, 2002). A constellation of tasks that examine spelling performance, retrieval of pronunciations, and identification of meanings should therefore yield a more fine-grained measure of individual differences in lexical integrity.
It is important to note that although pure distributional shifting is compatible with a head-start mechanism of priming, distributional shifts can also be produced by changes in the decision criterion, i.e., the amount of evidence required before a decision is made. For example, in evidence accumulation models such as the random-walk model, altering the decision criterion affects the μ component (distributional shifting) but has no effect on σ (variability) or τ (skewing) (Spieler, Balota, & Faust, 2000; Yap et al., 2006). Can the results in the present study be reconciled with a simple criterion-based mechanism of priming, whereby participants set a higher response threshold for targets preceded by unrelated primes? This account is problematic for two reasons. First, it is difficult to explain, in a principled manner, why priming shifts decision criteria to the same extent for high- and low-frequency words in readers with more vocabulary knowledge, but shifts them to different extents in readers with less vocabulary knowledge. Second, even if we assume that priming purely reflected changes in decision criteria, this clearly does not accommodate the pattern in E3, where priming was mediated by μ and σ for low-frequency words, or in E4, where priming reflected changes in both μ and τ, particularly for the low-frequency targets.
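The criterion-shift intuition can be demonstrated with a toy one-boundary random walk. This is a deliberate caricature, not the full random-walk or diffusion models cited above, and all parameter values are invented: evidence accumulates with constant positive drift, and raising the criterion mainly pushes the whole RT distribution rightward.

```python
import numpy as np

def walk_rts(drift, criterion, n_trials, rng):
    """Steps for a positive-drift random walk to first reach `criterion`."""
    rts = []
    for _ in range(n_trials):
        evidence, t = 0.0, 0
        while evidence < criterion:
            evidence += drift + rng.normal(0.0, 1.0)  # noisy evidence step
            t += 1
        rts.append(t)
    return np.array(rts, dtype=float)

rng = np.random.default_rng(3)
low = walk_rts(0.5, 20.0, 500, rng)    # lower criterion (e.g., related prime)
high = walk_rts(0.5, 30.0, 500, rng)   # higher criterion (unrelated prime)
shift = high.mean() - low.mean()       # raising the criterion slows responses
```

Because mean first-passage time scales with criterion/drift, the higher criterion slows responses overall, mimicking a distributional shift; but as the text notes, this mechanism alone cannot produce the σ and τ effects observed in E3 and E4.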
To summarize, these findings suggest that the nature of priming mechanisms may be modulated by the fluency of target processing. The priming data from E1 and E2 always reflected a shift, regardless of target frequency or nonword type. Distributional shifting is most easily reconciled with a relatively modular lexical processing system in readers with high quality underlying lexical representations, where the effect of the prime is to afford the same head-start to all targets. In contrast, when lexical processing becomes more difficult, the system becomes increasingly sensitive to useful contextual information, and flexibly relies more on prime information (see Balota & Yap, 2006, for a discussion of flexible lexical processing).
So far, our discussion has focused on the effects of priming and frequency on RTs. In RTs, one observes additive effects for readers with more vocabulary knowledge, and interactive effects for readers with less vocabulary knowledge. The trends are less clear when accuracy is the dependent variable. In E1 and E2, despite additivity in RTs, there was an overadditive priming by frequency interaction in accuracy. To further explore these results, we calculated the magnitude of the Priming × Frequency interaction in accuracy and RTs for each participant. We then correlated RT and accuracy interactions, and found that the correlations were not significant in either E1 (r = −.046) or E2 (r = .041), confirming that the additive effects are not simply an artifact of a speed-accuracy tradeoff. It is also worth noting that vocabulary knowledge did not predict the size of the interaction in error rates.
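The per-participant computation described above can be sketched as follows. The condition means and the distributions of scores are hypothetical; only the definition of the interaction score and the correlation step mirror the analysis.

```python
import numpy as np

def interaction_score(cells):
    """Priming x Frequency interaction: priming for low-frequency targets
    minus priming for high-frequency targets."""
    lf = cells[("low", "unrelated")] - cells[("low", "related")]
    hf = cells[("high", "unrelated")] - cells[("high", "related")]
    return lf - hf

# Hypothetical condition means (ms) for one participant
cells = {("low", "unrelated"): 760.0, ("low", "related"): 700.0,
         ("high", "unrelated"): 650.0, ("high", "related"): 617.0}
score = interaction_score(cells)  # 60 ms - 33 ms = 27 ms

# Correlating per-participant RT and accuracy interaction scores
rng = np.random.default_rng(4)
rt_scores = rng.normal(25.0, 30.0, 40)
acc_scores = rng.normal(2.0, 3.0, 40)   # independent by construction
r = np.corrcoef(rt_scores, acc_scores)[0, 1]
```

A near-zero `r`, as in E1 and E2, indicates that participants with larger RT interactions were not systematically the ones with smaller accuracy interactions, which is the pattern a speed-accuracy tradeoff would require.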
One might argue that the differences between the higher- and lower-vocabulary-knowledge readers can simply be attributed to a shift in response criteria. That is, higher-knowledge participants may be responding faster but making more errors in cases where lower-knowledge participants are responding slower but more accurately. We are skeptical that the present findings can be fully accommodated by this account. First, mean accuracy rates for higher- and lower-knowledge participants were very similar across the different experimental conditions, and in fact did not differ significantly (F < 1). If response criteria indeed varied as a function of vocabulary knowledge, then one would expect accuracy rates to be significantly lower for higher-knowledge participants for the most difficult trials (i.e., the low-frequency unrelated targets). Second, the account implies a speed-accuracy tradeoff for difficult trials, with lower-knowledge participants sacrificing speed for accuracy and higher-knowledge participants sacrificing accuracy for speed. However, again, there was no evidence of a speed-accuracy tradeoff in the difficult low-frequency unrelated condition. Specifically, the correlations (all non-significant) between RTs and accuracy were −.198, .039, −.068, and .135 respectively in the four experiments.
More importantly, the present analyses indicate that classifying the joint effects of priming and frequency as either additive or interactive is probably too inflexible. It might be more useful to conceptualize additivity and interactivity as poles of a continuum, with many intermediate positions in between. It is likely that the effects produced by the WUSTL and SUNY-A samples represent different points on the continuum, and participants can be “pushed” to show greater additivity or interactivity depending on a constellation of factors, including word frequency, vocabulary knowledge, perceptual degradation, and possibly prime-target associative strength. In this framework, one can consider the SUNY-A participants more “interactive” because they produce a significant interaction in both RTs and accuracy, whereas the WUSTL participants are more “additive” because they produce a significant interaction in accuracy and a non-significant trend towards greater priming for low-frequency words in RTs. In fact, vocabulary knowledge and word frequency are continuous variables, and we obviously sampled only two levels of each in the present study. In principle, had we sampled a larger range of word frequencies, we likely could have produced an interaction in RTs even for our high-knowledge readers. Again, this is consistent with the Balota et al. (2008) study in which stimulus degradation produced a reliable interaction in the tail of the distribution even for the WUSTL sample. Although this may sound remarkably similar to the Plaut and Booth (2000) sigmoid function, we will discuss in the next section how the specific results in the present study are not that easy to reconcile with that function. Ultimately, to simultaneously accommodate RT and accuracy data in a principled manner, one needs an explicit model, such as Ratcliff, Gomez and McKoon’s (2004) diffusion model of lexical decision performance. Such an approach may also provide insights into the differences between higher- and lower-knowledge readers.
As discussed, the interaction between semantic priming and word frequency has been considered one of the benchmark effects in the word recognition literature. However, our results show that this “benchmark effect” is modulated by individual differences. Participants with (relatively) less vocabulary knowledge produce an interaction, while higher-vocabulary-knowledge participants produce additive effects. We tested this more rigorously across all participants by examining whether vocabulary knowledge, as measured by Shipley raw scores, predicted the magnitude of the Priming × Frequency interaction (i.e., larger priming effects for low-frequency targets), after controlling for chronological age, years of education, and nonword type. These regression analyses confirmed that vocabulary knowledge and the magnitude of the Priming × Frequency interaction were negatively correlated in RTs and τ (a measure of distributional skewing), providing converging evidence that participants with less vocabulary knowledge were indeed more likely to produce larger priming effects for low-frequency targets. Furthermore, the influence of vocabulary knowledge on the interaction was primarily mediated by participants’ slowest RTs, which reflect the most difficult trials for a participant.
Interestingly, this pattern is the exact opposite of what Plaut and Booth (2000) found. If we assume that vocabulary knowledge and perceptual ability map onto lexical input strength in the same way in the single-mechanism model (see Figure 2), how might one account for the discrepancy? Perhaps the inconsistency between the two studies is more apparent than real. One way for the model to accommodate the data is to assume that low-knowledge readers are represented at the leftmost steep portion of the activation function, and high-knowledge readers are represented at the middle, gradual portion of the function. This would allow low-knowledge readers to show interactive effects and high-knowledge readers to show additive effects. There are, however, two problems with this “solution”. First, if low-knowledge readers are positioned at the leftmost end of the curve, where the input-output function resembles a power function, then this should yield larger priming effects for high-frequency targets, because effects are larger for stronger inputs on this portion of the continuum. In our study, however, the low-knowledge readers produced larger priming effects for low-frequency targets. The second problem reflects the predictions made for readers with very high vocabulary knowledge (i.e., higher than the current WUSTL sample), who should be represented at the rightmost portion of the function. The function predicts interactive effects of priming and frequency for such readers. This pattern seems most improbable given the present results.
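The positional argument can be made concrete with a toy logistic input-output function. This is a caricature of the sigmoid, not Plaut and Booth's implemented model, and the input values are arbitrary: a fixed priming increment to input strength yields a larger output gain for the stronger input in the accelerating, leftmost portion of the curve, but a larger gain for the weaker input past the inflection.

```python
import math

def logistic(x):
    """Toy sigmoid activation function."""
    return 1.0 / (1.0 + math.exp(-x))

def priming_gain(x, delta=0.5):
    """Output boost from adding a fixed priming increment to input strength x."""
    return logistic(x + delta) - logistic(x)

# Leftmost, accelerating region: the stronger (high-frequency) input gains more
left_weak, left_strong = priming_gain(-4.0), priming_gain(-3.0)
# Decelerating region past the inflection: the weaker input gains more
right_weak, right_strong = priming_gain(1.0), priming_gain(2.0)
```

This is exactly the first problem raised above: placing low-knowledge readers at the leftmost end predicts larger priming for high-frequency (stronger-input) targets, the opposite of what they showed.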
Alternatively, it is possible that the low-knowledge participants in our study actually correspond to the high-perceptual-ability participants in Plaut and Booth’s (2000) study. On this view, these participants produced an interaction because they were located within the portion of the curve where a Priming × Frequency interaction emerges. In contrast, the high-knowledge readers were located further up, in the asymptotic portion of the curve, where RTs are faster but the interaction is smaller due to a ceiling effect. In fact, Plaut and Booth (2006) used this explanation to account for Borowsky and Besner’s (1993) finding that visually degrading words strengthened rather than weakened the priming by frequency interaction. Of course, the foregoing discussion is based on a somewhat simplistic approach to accommodating empirical effects within the sigmoid function. Given the flexibility of the function, it is important to impose appropriate constraints when evaluating it, and this is more challenging than typically assumed (see Besner & Borowsky, 2006, and Plaut & Booth, 2006, for more discussion). More specifically, the sigmoid function does not literally describe the operations of the single-mechanism model implemented by Plaut and Booth (2000); rather, it is at best a metaphor for the actual behavior of the model (Plaut & Booth, 2006). In fact, Plaut and Booth (2006) demonstrated that their model could simulate empirical results that were inconsistent with the most straightforward interpretation of the sigmoid function. Hence, our criticisms of the sigmoid function may not apply to the actual implemented model.
The discrepancy between Plaut and Booth’s (2000) study and ours may also be due to the fact that perceptual ability reflects amodal decoding speed while vocabulary knowledge reflects the integrity of underlying lexical representations. As we have suggested earlier, the extent to which prime information is retrospectively retrieved depends on how effortful it is to resolve a lexical target. This type of effort may be related to the integrity of lexical representations (tapped by vocabulary knowledge) but not to perceptual decoding speed (tapped by perceptual ability). In fact, Plaut and Booth’s (2000) study provides some support for this dissociation. First, as mentioned in the Introduction, perceptual ability and vocabulary knowledge (as measured by the PPVT-R) were uncorrelated in their sample (r = .09), indicating that the two measures tap distinct constructs. Second, some aspects of their data may actually mirror our general findings. In Experiment 2, they examined the joint effects of priming and frequency in children, with age (3rd grade vs. 6th grade) as the between-participants variable. Presumably, age should be a good proxy for vocabulary knowledge. Interestingly, they reported that sixth graders produced overadditive effects of priming and frequency (with larger priming effects for low-frequency words), while third graders produced a more additive pattern (p. 797). On initial consideration, this seems quite consistent with the three-way interaction they obtained when perceptual ability was the between-participants variable. However, results presented in other portions of Plaut and Booth’s paper suggest that, contrary to what they reported, their third graders (i.e., the readers with less vocabulary knowledge) were actually showing more interactive effects of priming and frequency. Specifically, in both Figure 5 (p. 797) and the Appendix (p. 823), third graders appear to be showing larger priming effects for low-frequency (d = 75 ms) than high-frequency (d = 47 ms) words, whereas sixth graders showed more similar priming effects for low-frequency (d = 37 ms) and high-frequency (d = 46 ms) words. Plaut and Booth also conducted another analysis comparing adults to children, and found that both groups produced similarly sized overadditive effects of priming and frequency. If children have less vocabulary knowledge than adults, our account predicts stronger interactive effects for the children, relative to the adults; however, this was not the observed pattern. Plaut and Booth (p. 798) indicated that “this comparison is difficult to interpret” because the children and adults did not receive the same set of words (difficult words were eliminated for the children), and children and adults also differed in perceptual ability and other factors. It is plausible, though, that the magnitude of the interaction was similar for children and adults because the difficulty of the items was calibrated to their respective vocabulary knowledge. Our study, in contrast, used the same words for both the WUSTL and SUNY-A readers, and these words must have been relatively more difficult for the SUNY-A than for the WUSTL sample.
However, we need to acknowledge that our account does not offer an obvious explanation for Plaut and Booth’s (2000) findings, i.e., additive effects for low-perceptual-ability readers and interactive effects for high-perceptual-ability readers. We have argued that fluent lexical processors are more likely to yield additive effects, but there is no principled reason why low-perceptual-ability readers should be more fluent lexical processors, especially since they were actually substantially slower than the high-perceptual-ability readers on the lexical decision task (see Plaut & Booth, Figure 3). Rather than contriving a post hoc explanation for Plaut and Booth’s results, we suggest that the distinct patterns of results associated with perceptual ability and vocabulary knowledge provide interesting questions for future research. At the very least, the dissociations between the two individual differences measures indicate that it may be misleading to map them onto a single dimension (e.g., input strength on the single-mechanism model), and attempting to accommodate both measures under a unified theoretical framework may not be the best approach.
To recapitulate, the present findings, along with the results from Balota et al. (2008), indicate that as processing fluency decreases, due to low-integrity representations, visual degradation, or increased task demands, priming effects become larger, particularly for the most difficult targets in the tail of the RT distribution. These results further underscore the importance of considering individual differences in visual word recognition. An effect that is identified in a particular sample may not generalize to other sites, making it important to replicate novel effects across samples that may vary with respect to processing fluency (cf. Yap, Balota, Tse, & Besner, 2008). In addition, it is clear that the lexical processing system is remarkably flexible and adaptive, and can show greater reliance on the semantic context as target processing becomes more difficult. This could be viewed as consistent with Hutchison’s (2007) finding that participants, particularly those who are high in attentional control, are sensitive to the relatedness proportion in a priming experiment, producing larger priming effects as the relatedness proportion becomes higher. However, further work is needed to better understand the extent to which the present effects are under strategic control.
The nonword type manipulation (legal nonwords vs. pseudohomophones) produced an intriguing, counterintuitive finding with respect to priming effects: the presence of pseudohomophones attenuated the magnitude of priming effects. More specifically, the priming effect was smaller in μ for both the WUSTL and SUNY-A samples when a pseudohomophone context was used, compared to a legal nonword context. It is indeed intriguing for an effect to become smaller in the lexical decision task as discrimination becomes more difficult. In contrast, the effect of nonword context on the word frequency effect was precisely as predicted, i.e., we replicated the well-established finding of larger frequency effects in the pseudohomophone context compared to the legal nonword context (see Stone & Van Orden, 1993; Yap et al., 2006). Hence, the present results yielded the noteworthy pattern that the presence of pseudohomophones simultaneously increased word frequency effects and decreased semantic priming effects.
Why would priming effects become smaller in a pseudohomophone context? To address this question, it is first necessary to reiterate that the lexical decision task is primarily a binary discrimination task whose difficulty is a function of the overlap between words and nonwords. If we conceptualize the word recognition system as a collection of processing modules and pathways that support the computations mediating orthography, phonology, and meaning (Balota, Paul, & Spieler, 1999), the type of nonwords used in a lexical decision task may engage attentional control systems that appropriately adjust the weights between different modules. Specifically, in the standard lexical decision task, where legal nonwords are used, participants are attempting to discriminate between familiar/meaningful words and relatively unfamiliar/meaningless nonwords, and therefore emphasize the connections between orthography and meaning. However, pseudohomophones increase the familiarity/meaningfulness overlap between words and nonwords, and hence this dimension becomes less informative for word-nonword discrimination. In fact, relying on familiarity/meaningfulness tends to increase the false alarm rate, since pseudohomophones (e.g., BRANE) are, by design, constructed to activate meaning-based information via an orthographic-phonological pathway. In such a situation, the system may de-emphasize the pathway between orthography and meaning, and rely less on meaning, thereby reducing the semantic priming effect.
This prediction is consistent with the connectionist triangle model perspective (Plaut, 1997; Seidenberg & McClelland, 1989), where task demands modulate the extent to which participants attend to different types of lexical information (i.e., orthographic, phonological, and semantic) in lexical decision. For example, when nonwords are illegal (e.g., BRNTA), orthographic familiarity is sufficient for driving word-nonword discrimination. With legal nonwords (e.g., BRONE), phonological, rather than orthographic, familiarity is recruited. However, when distracters are pseudohomophones (e.g., BRANE), which look and sound like real words, only semantic familiarity is viable for decision-making. Hence, this account suggests that the decreased priming could be due to the attenuation of the phonology → semantics pathway, which would help suppress false alarms in the context of pseudohomophones. The foregoing discussion is necessarily speculative, but it does suggest that our counterintuitive finding can be accommodated either within a flexible lexical processing framework (Balota et al., 1999) or some version of the triangle model where pathway control is implemented. Both accounts of this intriguing finding merit further study.
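The common thread in both accounts is that the weight given to meaning-based information is reduced when pseudohomophones make that dimension unreliable. A toy sketch (entirely our own illustration; the weights and activation values are arbitrary, and neither model is literally implemented this way) shows how down-weighting a semantic pathway necessarily shrinks any priming effect that pathway carries:

```python
PRIME_BOOST = 0.3  # extra semantic activation contributed by a related prime

def decision_signal(orth, sem, w_sem):
    # Word-nonword evidence as a weighted blend of orthographic
    # familiarity and semantic activation.
    return (1.0 - w_sem) * orth + w_sem * sem

def priming_effect(orth, sem, w_sem):
    # Difference in the decision signal with versus without a related prime.
    primed = decision_signal(orth, sem + PRIME_BOOST, w_sem)
    unprimed = decision_signal(orth, sem, w_sem)
    return primed - unprimed

legal_context = priming_effect(0.8, 0.6, w_sem=0.7)   # semantics emphasized
pseudo_context = priming_effect(0.8, 0.6, w_sem=0.3)  # semantics de-emphasized
```

Because the priming effect here is simply w_sem × PRIME_BOOST, any adjustment that de-emphasizes the semantic pathway, whether framed as flexible module weighting or as pathway control in the triangle model, attenuates semantic priming.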
One of the original objectives of the present study was to establish whether the interactive effects of priming and frequency were more consistent with multiple independent stages (Sternberg, 1969) or with a single non-linear mechanism (Plaut & Booth, 2000). As it turns out, the answer to this question was less clear-cut than anticipated. Most critically, the interaction between priming and frequency was stronger for our low-vocabulary-knowledge readers than for our high-vocabulary-knowledge readers. As discussed, it is not easy to reconcile these findings with Plaut and Booth’s (2000) sigmoid activation function (see Figure 2), although, as pointed out in a previous section, criticisms directed against the sigmoid function may not necessarily apply to the actual implemented model (see Plaut & Booth, 2006). In principle, the data can be accommodated by the multistage perspective, using additive-factors logic to revise extant assumptions. It must be emphasized, however, that these modifications are post hoc and need to be empirically verified in future studies.
To recapitulate, in E1 and E2, additive effects of priming and frequency were observed, which is consistent with priming and frequency influencing independent stages. In E3 and E4, interactive effects were observed, which is consistent with priming and frequency influencing a common stage. This suggests that for high-lexical-integrity participants, priming influences only an earlier perceptual stage, by providing a head-start for subsequently presented targets, while frequency influences a later lexical retrieval stage. For low-lexical-integrity participants, priming exerts effects on both the early stage and the later lexical retrieval stage. How does priming influence the subsequent lexical retrieval stage? As we have argued in previous sections, as target processing increases in difficulty, reliance on prime information increases, especially for low-frequency targets.
The precise mechanisms that mediate the effect of target difficulty on semantic priming remain unclear. Possibly, when targets are difficult to process, it is more likely that there is episodic retrieval of the prime (Bodner & Masson, 1997), hence increasing its influence. Alternatively, if we adopt the perspective of the multistage activation model (Borowsky & Besner, 1993), related primes lower the response criterion (see Figure 1), and the more difficult the target is, the more the criterion is lowered. Since the rate of evidence accumulation is steeper for high-frequency targets than for low-frequency targets, a constant change in criterion for both classes of words should yield larger priming effects for low-frequency targets. Let us further assume that for all high-frequency targets, the lowering in criterion due to the related prime is relatively invariant, since none of the high-frequency targets are particularly difficult. On the other hand, low-frequency targets are more variable with respect to difficulty, and one expects the criterion to be lowered more for more difficult items. This would explain why priming effects become larger across the RT distribution for low-frequency, but not high-frequency, targets.
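The criterion account reduces to simple arithmetic. In this sketch (our illustration; the rates and criterion values are arbitrary), evidence accumulates linearly toward a response criterion, so RT is proportional to criterion ÷ rate, and a related prime lowers the criterion by a fixed amount:

```python
def priming_benefit(rate, criterion_drop):
    # Time saved when a related prime lowers the response criterion:
    # with linear accumulation, RT = criterion / rate, so the saving
    # from a criterion drop is criterion_drop / rate.
    return criterion_drop / rate

r_hf, r_lf = 2.0, 1.0  # high-frequency targets accumulate evidence faster
delta_c = 20.0         # constant criterion drop from a related prime

hf_benefit = priming_benefit(r_hf, delta_c)  # 10.0 time units
lf_benefit = priming_benefit(r_lf, delta_c)  # 20.0 time units
```

The same criterion drop yields a larger benefit for the slower-accumulating low-frequency targets; if the drop itself grows with item difficulty, the low-frequency priming effect grows further still in the slow tail of the RT distribution.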
There is an alternative stage-based account that does not require an appeal to episodic retrieval of the prime (Bodner & Masson, 1997). The multistage activation model (Borowsky & Besner, 1993) contains feedforward and feedback pathways between the orthographic input lexicon and the semantic system. Importantly, there is evidence that feedback from the semantic system to the orthographic input lexicon is neither mandatory nor automatic (Stolz & Neely, 1995). Instead, the feedback mechanism operates only when it is beneficial. For example, Stolz and Neely reported additive effects of priming and stimulus quality when relatedness proportion was low (RP = .25) but an overadditive interaction when relatedness proportion was high (RP = .50); interactive effects indicate semantic feedback while additive effects indicate no feedback. These results are consistent with the idea that when relatedness proportion is low, feedback from the semantic system is eliminated, because this feedback is not useful on the majority of trials. Similarly, one could argue that there is less semantic feedback for readers with high integrity lexical representations, because such readers process lexical targets fluently and hence there is relatively less benefit from related primes. Again, the results attest to the flexibility of the lexical processing system in accomplishing task goals.
The present study examined the joint effects of semantic priming and word frequency in lexical decision performance. The intriguing finding was that these two factors do not always interact; whether priming and frequency interact depends on the vocabulary knowledge of the participant. Readers with less vocabulary knowledge show larger priming effects, particularly for difficult low-frequency targets that fall in the tail of the RT distribution, and this is consistent with the idea of a flexible lexical processing system that optimizes task performance by emphasizing task-relevant information. In contrast, the lexical processing system of readers with more vocabulary knowledge appears to be more modular in nature, whereby the effect of a prime is primarily to provide a head-start that is independent of a target’s difficulty, i.e., one that shifts the entire RT distribution. From a methodological point of view, this study also underscores the need to extend visual word recognition research by considering individual differences and by analyzing how variables influence the underlying RT distribution.
This research was supported by National Institute on Aging Grant AG03991 and National Science Foundation Grant BCS0001801 to D. A. Balota. We thank Derek Besner, Andrew Heathcote, Debra Jared, Jim Neely, and Dave Plaut for their constructive comments on an earlier version of this paper, Viviana Benitez for her assistance with stimulus development and data collection, Jim Neely and Matt Thomas for their help with data collection at SUNY-Albany, and Keith Hutchison for his help with the primed lexical decision megastudy analyses.
1Shulman and Davison (1977) reported larger semantic priming effects in lexical decision when legal, compared to illegal, nonwords were used, but did not examine priming effects in the context of pseudohomophones. The study that comes closest to addressing this issue is one by Milota, Widau, McMickell, Juola, and Simpson (2000), who used the primed lexical decision task to prime real words, legal nonwords, or pseudohomophones. For example, the participant could see doctor (real word), docton (legal nonword), or docter (pseudohomophone) primed by either nurse (related) or win (unrelated). Interestingly, Milota et al. reported that pseudohomophone distracters, compared to legal nonword distracters, attenuated semantic priming, and attributed this effect to participants strategically suppressing the influence of a prime when it was less helpful for word-nonword discrimination. Specifically, there was a prime-target relationship on both word (nurse – doctor) and nonword (nurse – docter) trials. Note, however, that Milota et al.’s paradigm is clearly different from ours: half their pseudohomophones were primed by related words, inducing strategic suppression of prime information, whereas our pseudohomophones were never related to their primes. Hence, whether pseudohomophones increase the priming main effect and the priming by frequency interaction remains an open empirical question.
2In order to secure high-quality pseudohomophones, the 300 pseudohomophones used in E2 and E4 were selected from the appendices of published articles. The limited pool of available pseudohomophones, and the need to match these stimuli to words on length, imposed constraints that made it impossible for us to match the orthographic neighborhood of words and nonwords as closely as one would like.
3In the present and subsequent Vincentile analyses, we used the Greenhouse-Geisser correction for potential violations of sphericity.
4The WUSTL sample had more years of education and higher vocabulary scores than the SUNY-A sample. In order to address this confound, we included years of education as a covariate and re-ran all the ANOVAs that included University as a between-subjects factor. Importantly, these analyses yielded qualitatively similar trends, confirming that vocabulary knowledge, rather than years of education, was driving the group differences. This is consistent with a follow-up analysis in which we found a significant difference in vocabulary scores between participants from the two universities, even after chronological age and years of education were controlled for [F(1, 142) = 40.96, MSE = 8.017].
5After removing the outlier participant (whose vocabulary score was 15, more than 3 SDs below the overall mean, i.e., 32.6), we found qualitatively similar results in the hierarchical regression analyses. That is, after partialling out variance accounted for by age, years of education, and nonword type, vocabulary scores still predicted the interaction score in RTs [β = −.165, p = .068] and τ [β = −.229, p = .011], but not in error rates, μ, or σ, ts < 1.
6One might contend that the lack of an interaction in the WUSTL participants is simply due to their being less sensitive to the word frequency and semantic priming manipulations, compared to the SUNY-A participants. In other words, is the frequency range used in the present study simply not sufficient for detecting an interaction in the WUSTL sample? We do not think so, for the following reasons. As pointed out, WUSTL participants showed robust main effects of priming and frequency in both E1 and E2. Moreover, if we compare E1 and E3 (across the two samples), the priming effects for high-frequency targets were practically identical in the two experiments (E1: 40 ms, E3: 40 ms), along with the main effects of target frequency (E1: 25 ms, E3: 28 ms).