Despite the fact that list-method directed forgetting instruction leads to decreases in memory performance on tests of free recall, there are to date no published reports of comparable effects in recognition testing. This paper evaluates whether conditions that promote the importance of context in recognition, either via stimulus selection (Experiments 1 and 2) or by test choice (Experiment 3), elicit directed forgetting impairment. In all three experiments, we obtained reliable recognition deficits, suggesting that the typical conditions of recognition, rather than recognition itself, underlie the discrepancy between the tests of recognition and free recall in list-method directed forgetting.
In directed forgetting studies, participants are instructed to either forget or remember certain earlier studied items. Instructions to forget or remember can be delivered either on an item-by-item basis or after an entire list of items has been presented (known as the item-method and the list-method of directed forgetting, respectively). Directed forgetting instructions reduce memory for to-be-forgotten (TBF) items compared to to-be-remembered (TBR) items (for reviews, see E. L. Bjork, Bjork, & Anderson, 1998; MacLeod, 1998; and Johnson, 1994) on tests of free recall and recognition when the item method is used (e.g., Basden, Basden, & Gargano, 1993; MacLeod, 1999), but only on recall tests when the list method is used (Basden et al., 1993; Benjamin, 2006; Bjork & Bjork, 2003; Block, 1971; Conway, Harries, Noyes, Racsmány, & Frankish, 2000; Elmes, Adams, & Roediger, 1970; Geiselman, Bjork, & Fishman, 1983; Gross, Barresi, & Smith, 1971; Sahakyan & Delaney, 2005; Whetstone, Cross, & Whetstone, 1996). In this paper, we argue that the list-method directed forgetting impairment reflects the deleterious effect of a study-test context mismatch, and demonstrate that such effects emerge in recognition when stimuli are used or tests are employed that enhance the utilization of contextual characteristics.
In a typical list-method directed forgetting experiment, participants study two lists of items for a memory test. After the first list, half of them are told to forget that list because it was only for practice, or it was the wrong list, and therefore there is no need to remember those items. The remaining participants are told to keep remembering the list. Both groups proceed to study a second list, followed by a memory test for all items (including any items participants were instructed to forget). In free recall, the directed forgetting effect consists of two components -- the costs and the benefits of directed forgetting. The costs refer to impaired recall of List 1 items in the forget group compared to the remember group, whereas the benefits refer to enhanced recall of List 2 items in the forget group relative to the remember group. Our focus in this paper is the enigmatic absence of directed forgetting costs in recognition, despite the fact that the benefits appear to be robust (e.g., Benjamin, 2006; Sahakyan & Delaney, 2005). The present strategy for understanding why the costs are not apparent involves exploring conditions under which those costs can be made to re-appear. To do so, we take a lesson from studies of the role of global environmental context in recognition. First, however, we briefly review how the findings from recognition of intentionally forgotten material are problematic for historical and contemporary theoretical accounts of directed forgetting.
Single-process accounts of list-method directed forgetting invoke the same underlying mechanism to explain the costs and the benefits of directed forgetting, and have included selective rehearsal (e.g., Benjamin, 2006; Bjork, 1970, 1972; Sheard & MacLeod, 2005), as well as retrieval inhibition (e.g., Bjork, 1989; Bjork & Bjork, 1996; 2003; Geiselman et al., 1983). According to the selective rehearsal account, the presentation of a forget cue terminates rehearsal of all preceding items and extra rehearsal is therefore devoted to post-cue items, leading to the costs and the benefits respectively. According to the inhibitory account, the costs of directed forgetting reflect a temporary state of inhibition of the TBF items. The benefits arise from reduced proactive interference acting on post-cue items from inhibited TBF items in the forget group.
The dual-process account of directed forgetting relies on two different mechanisms to handle the directed forgetting effect (Sahakyan & Delaney, 2005). The first factor explains the costs by the context-change mechanism of Sahakyan and Kelley (2002). This hypothesis was based on the notion that during encoding, people process not only the meaning of the item and the inter-item relationships, but they also encode the context in which the item occurred. Contextual information refers to the environmental, spatial-temporal, emotional, or mental states in which the item was experienced. Sahakyan and Kelley (2002) proposed that directed forgetting instruction encourages participants to change their mental context between the lists, thereby segregating the two lists as separate events. An example of mental context cue would be the thoughts people experienced during List 1 learning, which in part include how people represent and think about the experimental episode connected with List 1 encoding. According to the contextual hypothesis, directed forgetting impairment arises because retrieval context better matches List 2 than List 1 encoding context, producing reduced recall of List 1 items. The second factor of the dual-process account explains the benefits by the strategy-change mechanism (Sahakyan & Delaney, 2003). According to this explanation, the benefits emerge because the forget cue encourages participants to adopt a more elaborate encoding strategy for post-cue items compared to the remember group. Thus, in the dual-factor account, the costs reflect a mismatch between the study and the test contexts, whereas the benefits reflect a strategic encoding enhancement.
Each of these accounts has difficulty with the finding that the costs emerge in free recall but not in recognition. It is problematic for the selective rehearsal account because both recognition and recall are sensitive to elaborative rehearsal (Geiselman & Bjork, 1980) and recognition is even more sensitive than recall to rote rehearsal (Benjamin & Bjork, 2000; Craik & Watkins, 1973). In other words, regardless of the nature of the rehearsal process involved in directed forgetting, terminating rehearsal of TBF items should have consequences for recognition memory. However, empirical findings to date attest to the contrary. Proponents of the inhibition account explain the lack of directed forgetting effects in recognition tests through the release of inhibition (e.g., Bjork & Bjork, 1996; 2003), by suggesting that re-presenting the TBF items on a recognition test releases their inhibition, thereby eliminating the directed forgetting effect. The inhibitory account is not fully satisfactory, in large part because it does not specify the mechanisms by which inhibition and release from inhibition operate. Finally, in the dual-factor account, the contextual explanation handled the absence of directed forgetting costs in recognition by pointing out that many environmental context manipulations did not affect recognition (Fernandez & Glenberg, 1985; Godden & Baddeley, 1980; Humphreys, Pike, Bain, & Tehan, 1988; Jacoby, 1983; Russo, Ward, Geurts, & Scheres, 1999; Smith, Glenberg, & Bjork, 1978; Smith, Vela, & Williamson, 1988). However, under certain conditions, environmental context effects do emerge in recognition (Canas & Nelson, 1986; Dalton, 1993; Krafka & Penrod, 1985; Malpass & Devine, 1981; Russo et al., 1999; Smith, 1985, 1986; Smith & Vela, 1992) and may lead to the reemergence of costs.
The experiments reported here employ characteristics that are thought to increase the effects of environmental context on recognition. If such contextual effects truly underlie the costs of directed forgetting, then the costs should be apparent under those conditions.
There is a large body of literature showing that memory suffers when there is a low correspondence between the background context present during study and during test (see Smith & Vela, 2001, for a meta-analysis of these findings). These context-dependent memory effects have been observed more often in free recall than in recognition tests, in which results have been more inconsistent.
Manipulations of background context may include a change in the global context such as the physical environment, or a change in the local context such as the text color, location of the word on the screen, the voice of a speaker, etc. The latter is termed local because it accompanies the presentation of each item and is typically fast-changing, whereas the former is characterized as global because it is slow-changing, and involves the encoding of an entire set of items in a single environmental context (Glenberg, 1979; Smith & Vela, 2001). Different types of contextual manipulations have produced different effects in recognition, with local context mostly affecting response strategy (e.g., Feenan & Snodgrass, 1990; Murnane & Phelps, 1993; 1995), and global context affecting discriminability, albeit not consistently (for a review, see Smith & Vela, 2001). Our investigation was guided principally by the studies of global environmental context in recognition, because such a conceptualization of context is more similar to the notion of Sahakyan and Kelley's (2002) context change involved in directed forgetting. If directed forgetting instigates a mental context change after an entire set of items has been studied, then this type of context change is better described as a global, list-wide context change rather than a local context change that accompanies the presentation of each item.
The first two experiments reported here were motivated by the observation of Smith and Vela (2001) that global environmental context effects in recognition were larger when encoding processes were primarily nonassociative, especially when the study materials involved unfamiliar faces or non-words (Dalton, 1993; Krafka & Penrod, 1985; Malpass & Devine, 1981; Russo et al., 1999; Smith & Vela, 1992). More specifically, our first experiment was inspired by the study of Russo et al. (1999), the design of which is similar to the list-method directed forgetting paradigm, and therefore we discuss it in greater detail.
Participants encoded two lists of non-words in two different rooms (A and B). After a brief delay in a waiting room, they returned to room A and received a Yes/No recognition test on half of the non-words studied in room A, along with half of the non-words studied in room B, plus an equal number of unstudied non-words. Then, they went to room B, and completed the recognition test on the remaining non-words studied in room B or room A, along with new distractors. Russo et al. (1999) reported reduced discrimination accuracy for non-words tested in the mismatching context. A similar experiment failed to obtain the effect of context when the study materials consisted of words.
Russo et al. (1999) proposed several hypotheses to explain their findings, and we revisit them in part because they motivated our directed forgetting studies with non-words. Broadly speaking, these explanations emphasized the association between the item and its context established during encoding, with non-words forming stronger item-to-context associations than words. This could occur because prior history of occurrence of words in multiple contexts might create contextual competition and lead to less specific contextual information being encoded in the memory trace when the words are processed in a specific experimental context. Thus, context cues might be better retrieval cues for non-words than words because non-words have no prior “real world” contextual associations to interfere with experimental item-to-context associations. Research shows that concepts experienced in fewer pre-experimental contexts suffer more when environmental context changes between the study and test, presumably because they form stronger item-to-context associations during encoding compared to concepts that appear in many pre-experimental contexts (Marsh, Meeks, Hicks, Cook, & Clark-Foos, 2006). Similarly, both recognition (Gorman, 1961) and source memory (Guttentag & Carroll, 1997) are superior for uncommon words than for common words, and uncommon words tend to be experienced in a smaller variety of contexts (Marsh, Cook, & Hicks, 2006; Steyvers & Malmberg, 2003).
Another related explanation proposed by Russo et al. (1999) is that context associated with non-words may act differently from the context associated with words. This reasoning relies on Baddeley's (1982) distinction between independent and interactive contexts. The former refers to context being independently associated with the item without significantly modifying its meaning, whereas the latter refers to context being integrated with an item and changing its meaning (e.g., the meaning of jam is different in the context of strawberry and traffic). Because non-words do not have a stable meaning, context could become interactively encoded with them and elicit a temporary meaning, whereas it may be independently associated with words and have no detrimental effect on the perception of their meaning. A change in interactive context could reduce discrimination accuracy of non-words because the meaning elicited at test may not match the meaning elicited by context during encoding (e.g., Light & Carter-Sobell, 1970; Tulving & Thomson, 1973). In contrast, changes in independent context are not typically hypothesized to change the semantic meaning, and hence words may be more resistant to context effects. Note that when items are deliberately integrated with incidental context (even without being modified interactively by their context), they become more vulnerable to changes of context as compared to when these two sources of information are not integrated with each other and are instead independently encoded (e.g., Murnane, Phelps, & Malmberg, 1999).
Finally, in addition to considering the processes engaged during encoding of non-words, it is also important to discuss the role of retrieval processes involved in recognition judgments because they will play a role during the test. Using the dual-process framework of recognition (e.g., Jacoby, 1991; Mandler, 1980), some researchers have argued that familiarity-based recognition is unaffected by environmental context changes, whereas recognition accompanied by conscious recollection is impaired when context changes at test (e.g., Macken, 2002). Familiarity-based recognition might be sufficient to recognize words, but discriminating non-words might require retrieval of more specific details (e.g., specific letter-order combinations) because non-words have fewer alternative bases for recognition due to having less semantic content. If context change hurts retrieval-based recognition more than familiarity-based recognition, non-words should show more directed forgetting than words.
The current experiments employed manipulations of stimulus type (words versus non-words) and the particular demands of the recognition test (plurality discrimination versus standard recognition) to see if the costs of directed forgetting emerge under circumstances that increase the significance of contextual information.
We implemented the list-method directed forgetting design, but had participants study two lists of non-words rather than words. Memory was tested with an Old/New recognition test.
Fifty-six UNCG undergraduates participated in the experiment in exchange for course credit. They were tested individually. Half of them were assigned to the forget group, and the remaining half were in the remember group.
The stimuli were 112 non-words that were created by a group of undergraduates for a related classroom project (the stimuli are shown in the Appendix, Table A1). They were variations on Bermudian slang words that were altered to reduce their similarity to English words. In pilot testing, a separate group of 18 undergraduates were given the list of non-words and asked to write down the first English word that came to mind in response to the non-words. If a given non-word elicited the same association in two or more people, it was not included in the pool of final stimuli.
From a pool of 112 items we constructed four lists of 28 items (Lists A, B, C, and D). For half of the participants, Lists A and B served as targets, and Lists C and D served as distractors, whereas for the remaining participants, Lists C and D served as targets, and Lists A and B served as distractors. Each target list was assigned equally often to appear as the first or the second study list. Presentation order of the items within each study list was randomized for each participant. Test lists were constructed by randomly mixing the target items from both lists with an equal number of distractors from the unstudied lists.
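The counterbalancing scheme just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `build_session`, the use of a participant index to rotate the assignments, and the seeding scheme are hypothetical and are not taken from the software used in the experiment.

```python
import random

def build_session(pool, participant_index, seed=0):
    """Return (study_list_1, study_list_2, test_list) for one participant.

    Hypothetical sketch: the pool is cut into four equal lists (A-D),
    and the participant index rotates which pair serves as targets and
    which target list is studied first.
    """
    assert len(pool) % 4 == 0
    size = len(pool) // 4
    a, b, c, d = (pool[i * size:(i + 1) * size] for i in range(4))

    # Even-indexed participants study A and B; odd-indexed study C and D.
    if participant_index % 2 == 0:
        targets, distractors = [a, b], [c, d]
    else:
        targets, distractors = [c, d], [a, b]

    # Every second pair of participants studies the target lists in reverse order.
    if (participant_index // 2) % 2 == 1:
        targets.reverse()

    rng = random.Random(seed + participant_index)
    # Sampling the full length returns a shuffled copy, so the pool is untouched.
    list1 = rng.sample(targets[0], size)
    list2 = rng.sample(targets[1], size)

    # The test list mixes all studied items with an equal number of new items.
    test = list1 + list2 + distractors[0] + distractors[1]
    rng.shuffle(test)
    return list1, list2, test
```

With a 112-item pool, each call yields two 28-item study lists and a 112-item test list containing 56 old and 56 new items.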
The design of the study was a Cue (forget vs. remember) × Lists (List 1 vs. List 2) mixed factorial, with Lists as the within-subjects factor.
Participants were given two lists of 28 non-words per list to study for an unspecified memory test. They were told to read the non-words aloud and to try to memorize them. Items were presented one at a time on the computer screen at a rate of 4 s per item. After encoding List 1, half of the participants were told that the first list was just for “practice”, and that there was no need to remember it. They were told to try to forget it. The remaining participants were told to remember that list because it was only “the first half of the items”. Then all participants studied List 2, which was followed by a remember instruction for all participants.
Upon completion of the study phase, participants received an Old/New recognition test. They were told that they would be presented with items, some of which they had learned during the experiment, and some of which would be new and had not been presented in the experiment. They were told to press OLD if they remembered studying the non-word either on List 1 or List 2 (in the forget group, the instructions referred to the lists as the “practice” list or the “experimental” list). Otherwise, they were told to press NEW if they did not remember studying the item in the experiment. For each participant, the computer generated a test list by randomly mixing all 56 non-words from List 1 and List 2 with 56 new distractors.
We replaced four participants whose hit rates were below their false alarm rates. The hits for List 1 and List 2 items as well as the false alarms are reported in Table 1, along with statistical comparisons across the forget and the remember conditions. The lack of a difference between false alarms and between List 2 hits across conditions suggests that participants in these two groups did not employ different decision strategies. In this section, we report the analyses of the discrimination accuracy (d').
In calculating d' scores, hit and false alarm rates were calculated by adding .5 to each frequency and dividing by N + 1, where N represents the number of old items (for calculating hits) or new items (for false alarms). This procedure was recommended by Snodgrass and Corwin (1988) to avoid estimation problems arising from hit rates of 1 and false alarm rates of 0. We conducted a Cue (forget vs. remember) by Lists (List 1 vs. List 2) mixed analysis of variance (ANOVA) on d' scores.
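For concreteness, the corrected rates and the resulting d' can be computed as in the following minimal Python sketch. It assumes the standard equal-variance signal detection definition, d' = z(hit rate) − z(false alarm rate); the function names are illustrative and do not come from the original analysis scripts.

```python
from statistics import NormalDist

def corrected_rate(count, n_items):
    """Rate with .5 added to the frequency and 1 added to N,
    which keeps the result strictly between 0 and 1."""
    return (count + 0.5) / (n_items + 1)

def d_prime(hits, n_old, false_alarms, n_new):
    """Equal-variance d' computed from the corrected rates (assumed definition)."""
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(corrected_rate(hits, n_old)) - z(corrected_rate(false_alarms, n_new))

# Example with Experiment 1 list sizes: 28 old items per study list, 56 new items.
# A perfect score stays finite because neither corrected rate reaches 0 or 1.
perfect = d_prime(hits=28, n_old=28, false_alarms=0, n_new=56)
```

Without the correction, a hit rate of 1 or a false alarm rate of 0 would map to an infinite z score, which is the estimation problem the procedure is meant to avoid.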
There was a significant List × Cue interaction, F(1, 54) = 4.27, MSe = .098, p < .05. Follow-up tests showed that there was a significant difference in discrimination accuracy of List 1 items between the forget and the remember groups, t(54) = 2.91, p < .05. However, there were no differences between the groups on List 2 items, t < 1. In other words, we observed the costs, but not the benefits of directed forgetting with non-word stimuli.
This is the first demonstration of significant memory impairment in directed forgetting using a recognition test. Prior studies have failed to obtain this effect with word stimuli, whereas the use of non-words allowed us to demonstrate directed forgetting costs in recognition. Given the importance of such findings, we aimed to replicate the costs with more “conventional” non-word stimuli in the next experiment, because the non-words used in this experiment were created in our lab, and perhaps our findings were specific to those stimuli.
Two prior studies reported recognition benefits, but they did not obtain the costs (Benjamin, 2006; Sahakyan & Delaney, 2005). In contrast, Experiment 1 revealed the opposite – significant costs, but no benefits with non-words. If the directed forgetting benefits arise from more elaborative encoding strategies employed by the forget group (Sahakyan & Delaney, 2003), then the absence of benefits with non-words may be due to difficulty applying such strategies to non-associative stimuli, which are considerably harder to interrelate and link to each other than words. However, because such an explanation is post hoc, we sought to evaluate it in the next experiment by including a word condition, where we anticipated obtaining the benefits.
In this study, some participants learned non-words, whereas others learned words in order to allow direct comparison across the stimulus type. The non-words were drawn from prior published studies. We also varied the length of the study lists, with some participants studying two short lists (16 items per list) and some participants studying two long lists (36 items per list).
There were reasons to include a list length manipulation; specifically, in Experiment 1, the study lists included 28 non-words per list, which is longer than the lists used in the majority of the list-method directed forgetting studies. Therefore, it could be that impaired recognition with non-words was mediated by the length of the lists. When Sahakyan and Delaney (2005) first documented directed forgetting in recognition (they found the benefits, but not the costs), they noted that the benefits were obtained with long lists but not the short lists. Given that Experiment 1 is the first demonstration of directed forgetting costs in recognition, it could be that list length in general is involved as a factor in recognition differences in directed forgetting. By including a short-list condition in this experiment, we would be able to evaluate whether the costs would also emerge with 16-item lists of non-words, which is a more common list length for the list-method directed forgetting studies. In the long-list condition, we expected to replicate the costs with non-words that we obtained in Experiment 1. In addition, we wanted to further examine whether the absence of benefits obtained in Experiment 1 was specific to non-word stimuli. If the difficulty of using elaborative encoding with non-words is what precluded us from obtaining the benefits in the previous experiment, then the directed forgetting benefits should be observed in a condition with words, which can be more elaborately encoded. To maximize the chances of obtaining the benefits with words, we included the long list condition because prior studies examining the benefits in recognition have also detected them with longer lists (e.g., Benjamin, 2006; Sahakyan & Delaney, 2005). 
One explanation for why longer lists contribute to detecting the benefits is that longer lists might create a larger list-length effect in the remember group than in the forget group, because increasing the list length impairs recollection (e.g., Cary & Reder, 2003; Yonelinas & Jacoby, 1994). However, if List 2 items are better encoded in the forget group, then more elaborate encoding can provide support for recollection and offset the list-length effect in the forget group. To summarize, based on prior studies, we expected to obtain the benefits of directed forgetting with long lists in the word condition, and we wanted to examine whether they would also emerge with the non-words.
One hundred forty-four UNCG undergraduates participated in this experiment in exchange for course credit. None of them had participated in Experiment 1. They were tested individually.
The stimuli were 144 words (mean frequency, 60.7 per million; Kucera & Francis, 1967) and 144 non-words. Non-words were taken from the study of Russo et al. (1999), who in turn selected their stimuli from the materials created by McCann and Besner (1987). The word stimuli are shown in the Appendix (see Table A2).
For each participant, two study lists of pure words or pure non-words were constructed. The study lists were either short or long, with short lists containing 16 items and long lists containing 36 items per list. From a pool of 144 items (words or non-words), we constructed four lists of 36 items (Lists A, B, C, and D). For half of the participants, Lists A and B served as targets, and Lists C and D served as distractors, whereas for the remaining participants, Lists C and D served as targets, and Lists A and B served as distractors. Each target list was assigned equally often to appear as the first or the second study list. Presentation order of the items within each list was randomized for each participant. Participants in the long-list condition studied the entire Lists A and B (or C and D) during the study session, and received a test list containing 72 studied items randomly mixed with an equal number of distractors from the unstudied lists. Participants in the short-list condition studied two lists of 16 items randomly selected from Lists A and B (or C and D). The short recognition test lists contained 32 studied items mixed with 32 distractors selected from the unstudied lists. Half of the distractors were randomly drawn from one unstudied list, and the remaining half were drawn from the second unstudied list.
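The sampling used in the short-list condition can be illustrated with the Python sketch below. It assumes that each 16-item study list was drawn from a different 36-item target list; the function name `build_short_session` and its parameters are hypothetical, introduced only for illustration.

```python
import random

def build_short_session(targets, distractors, n_study=16, seed=0):
    """Hypothetical sketch of the short-list condition.

    targets and distractors are each a pair of 36-item lists
    (e.g., Lists A and B as targets, Lists C and D as distractors).
    """
    rng = random.Random(seed)

    # One short study list is sampled from each target list (assumption).
    list1 = rng.sample(targets[0], n_study)
    list2 = rng.sample(targets[1], n_study)

    # Half of the distractors come from each unstudied list.
    lures = rng.sample(distractors[0], n_study) + rng.sample(distractors[1], n_study)

    # 32 studied items are mixed with 32 distractors.
    test = list1 + list2 + lures
    rng.shuffle(test)
    return list1, list2, test
```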
The encoding and the testing procedures followed Experiment 1.
The design was a Cue (forget vs. remember) × List (List 1 vs. List 2) × Item Type (words vs. non-words) × List Length (short vs. long) mixed factorial, with Lists as the only within-subjects factor.
We replaced two participants in the non-word condition whose hit rates were lower than their false alarm rates. The hit rates for List 1 and List 2 items as well as the false alarm rates are reported in Table 2. In this section, we focus on the analyses of the discrimination accuracy (d') and response bias (CL).
In calculating d' scores, hits and false alarms were transformed according to the procedures recommended by Snodgrass & Corwin (1988) described earlier. We conducted a mixed ANOVA on d' scores using Cue (forget vs. remember), List Length (short vs. long), Item type (words vs. non-words), and Lists (List 1 vs. List 2). The results are summarized in Table 3.
There was a main effect of List Length, F(1, 136) = 24.17, MSe = .657, p < .001, with better discriminability of short lists (2.12) than long lists (1.65). There was also a main effect of Item type, F(1, 136) = 76.08, MSe = .658, p < .001, with better discriminability of words (2.30) than non-words (1.47). There was a significant List × Cue interaction, F(1, 136) = 5.55, MSe = .112, p < .05, signifying a directed forgetting effect. However, this effect depended both on the item type and on the length of the list, as evidenced by a significant four-way interaction, F(1, 136) = 5.90, MSe = .112, p < .05. The remaining effects were not significant. To follow up the interaction, we analyzed the words and the non-words conditions separately.
A List × Cue × List Length ANOVA on d' scores revealed a significant effect of List Length, F(1, 68) = 8.18, MSe = .972, p < .01, confirming better discriminability of short lists (2.54) than long lists (2.07). None of the remaining main effects were significant (Fs < 1), and neither was the Cue × Length interaction (F = 1.13). However, there was a significant 3-way interaction, F(1, 68) = 7.22, MSe = .123, p < .05, indicating that the directed forgetting effect varied as a function of list length. The results are displayed in Figure 1.
To follow up the interaction, we examined each list length condition separately. There was no directed forgetting with short lists: there was neither a List × Cue interaction, F(1, 34) = 1.59, p = .22, nor a main effect of Cue (F < 1), nor a main effect of List, F < 1. In other words, there were neither directed forgetting costs nor directed forgetting benefits. In contrast, in the long list condition, there was a significant List × Cue interaction, F(1, 34) = 6.68, MSe = .115, p < .05. Follow-up tests revealed that there were no differences in discrimination of List 1 items between the forget and the remember groups (t < 1), implying that there were no costs of directed forgetting in the long lists. However, there was a significant difference in List 2 discrimination, t(34) = 2.05, p < .05, indicating directed forgetting benefits.
A List × Cue × List Length ANOVA in the non-words condition revealed a main effect of List Length, F(1, 68) = 23.19, MSe = .343, p < .001, indicating better discriminability of the short lists (1.70) than the long lists (1.23). There was also a main effect of Cue (F(1, 68) = 4.19, MSe = .343, p < .05), and a main effect of List (F(1, 68) = 4.96, MSe = .101, p < .05), which were qualified by a significant Cue × List interaction, F(1, 68) = 6.62, MSe = .101, p < .05. Follow-up comparisons revealed significant differences in discrimination of List 1 items between the forget group (1.34) and the remember group (1.69), t(70) = 2.84, p < .01, indicating directed forgetting costs with non-words. However, there was no significant difference in discrimination of List 2 items between the forget and the remember groups, t < 1. In other words, there were no directed forgetting benefits. These findings did not vary as a function of list length. There was neither a 3-way interaction (F < 1), nor a two-way interaction involving the list length (F = 1.18). To summarize, in the non-words condition, there were directed forgetting costs both with the short lists and the long lists. However, there were no benefits in either list length.
Next, we analyzed the response bias measure, CL (Rotello & Macmillan, 2008). Higher numbers in this measure indicate more conservative responding. A Cue × Item Type × List Length ANOVA on CL revealed a significant main effect of Item type (F(1, 136) = 46.57, MSe = .204, p < .001), indicating a more conservative response bias in the words condition (1.33) than in the non-words condition (.81). There was also a significant main effect of List length (F(1, 136) = 7.29, MSe = .204, p < .01), indicating a more conservative bias in the short list condition (1.17) than in the long list condition (.97). There was neither a main effect of Cue (F < 1), nor any interactions (all Fs < 1). To summarize, participants set a more conservative response criterion in the conditions where the discrimination task was easier, as is typical (e.g., in short lists; with word stimuli; Benjamin, 2003; Benjamin & Bawa, 2004). However, directed forgetting did not affect response bias in any of the experimental conditions.
The results showed that directed forgetting impaired discriminability of List 1 items, and the results were not due to response bias. However, impaired recognition was found only with non-word stimuli, irrespective of the list length. Word stimuli, on the other hand, did not lead to impaired recognition, confirming the findings of earlier directed forgetting studies. Because directed forgetting lowered the hits without affecting the false alarms in the non-word condition, significant differences in discriminability of List 1 items emerged between the forget and the remember groups. In contrast, in the word condition, directed forgetting did not impair discriminability – neither the hits nor the false alarms were lowered. Finally, despite impaired recognition of List 1 items in the non-word condition, there was no associated memory enhancement of List 2 items. The benefits of directed forgetting emerged only with words, and only with long lists. We defer further discussion of these findings to the General Discussion section.
Across two experiments, we detected directed forgetting costs with non-words. However, consistent with published findings in directed forgetting, we did not observe recognition impairment with words.
It could be that the reason we obtained the effect with non-words was that episodic associations established between the non-words and their context played a larger role in recognition of non-words. One reason this could occur is that it is harder to create inter-item associations with non-words compared to words. Therefore, it could be that when inter-item associations were minimized, non-words relied more heavily on context because they have fewer alternative bases for recognition. However, it could also be the case that directed forgetting with non-words emerged because non-words were more similar to their distractors than the words, requiring more retrieval-based processes to discriminate them from each other. For example, Greene (2004) showed that participants rated non-words to be more similar to the study list (composed of a mixture of different non-words and words) than the words. Non-words may evoke a greater feeling of similarity to each other because they have less semantic content, and their structural features like letters and phonemes may give rise to a greater feeling of similarity than when the assessment of similarity is mediated by meaning. In contrast, the semantic features of words make them more distinctive and unique compared to non-words, and this may reduce their overall assessment of similarity. If non-words are perceived to be more similar to each other than the words, then in order to discriminate those from each other, more direct retrieval may be engaged than in recognition of words, which could be discriminated on the basis of familiarity alone. Participants may attempt to retrieve more specific details, such as whether they have seen a particular letter combination in that order, and whether that combination was at the beginning, the middle, or the end of a non-word, instead of relying on familiarity, which may be less diagnostic.
Prior studies with words have shown that when targets and distractors are perceptually similar, direct retrieval strategies are often engaged during discrimination (e.g., Arndt & Reder, 2002; Rotello, Macmillan, & Van Tassel, 1999). Thus, we suspected that in order to observe directed forgetting in recognition, the test may need to engage more direct retrieval than has been typically captured by prior directed forgetting studies with words. Benjamin (2006) noted that familiarity-based recognition may be used when it can lead to acceptable levels of performance, and may be unaffected by directed forgetting, whereas retrieval-based recognition may be impaired. However, many prior directed forgetting studies with words have created situations that were conducive to familiarity-based recognition, and more direct retrieval conditions are likely needed to obtain directed forgetting in recognition.
If direct retrieval was contributing to the directed forgetting impairment with non-words, then we should be able to replicate these findings by using word stimuli and creating a test that places a greater demand on retrieval. One way to do this would be to manipulate the similarity of distractors to the targets. The presence of similar distractors on the test often elicits a recall-to-reject process, which capitalizes on the information about the recalled item to reject a similar distractor that could not be directly retrieved (e.g., Rotello et al., 1999). For example, if during encoding participants studied cloud, and on the recognition test were shown clouds, they might retrieve cloud from memory, and use it to reject clouds.
In Experiment 3, we required participants to discriminate the targets either from similar or dissimilar distractors. Similar distractors were plurality-reversed versions of the target words (Hintzman & Curran, 1994), whereas dissimilar distractors were completely new words. We hypothesized that the plurality discrimination task would engage more direct retrieval during the test, thereby enabling us to detect directed forgetting even with word stimuli.
Ninety-six UNCG undergraduates participated in this experiment in exchange for course credit. None of them had participated in Experiments 1 or 2. They were tested individually.
The stimuli were 60 common nouns (mean frequency, 53.7 per million; Kucera & Francis, 1967) and their plurals (e.g., bean – beans). The words are shown in the Appendix, Table A3. We chose only regular nouns so that their plural form could be created by adding an s, and so that the plural form would be of approximately the same word frequency as the singular form of the word. From the set of 60 singular-plural pairs, 40 pairs were used to create the study lists, whereas the remaining 20 pairs served as distractors in one of the conditions of the recognition test. The 40 study pairs were further divided into two sets of 20 pairs (sets A and B). Each set was used to create two different study lists by selecting 10 singular items and 10 plural items from the word pairs in each set. Only a singular or a plural form of a word was selected from the word pair, but not both. The singular/plural aspect of the selected words was counterbalanced. Thus, there were a total of four study lists (A1, A2 and B1, B2) containing 20 different items, half of which were singular words, and half were plural words. Each list was assigned equally often to serve as the first or the second study list. If A1 or A2 served as the first study list, then the second study list was either a B1 or B2 list (or vice versa). Two different lists constructed from the same set (e.g., A1 and A2) were never presented together. The presentation order of the words within each study list was randomized for each participant.
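The list construction described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used to generate the materials. The function names (`build_study_lists`, `list_orders`, `session_lists`) are hypothetical, the split of each set into a "first ten" and "last ten" is one arbitrary way to counterbalance plurality, and assigning list orders by participant index is an assumed way of using each list equally often in each position.

```python
import random


def build_study_lists(pairs):
    """Build lists A1, A2, B1, B2 from 40 (singular, plural) pairs.

    Each list holds 10 singular and 10 plural words. Within a set,
    version 2 reverses the plurality of every word used in version 1.
    """
    if len(pairs) != 40:
        raise ValueError("expected 40 singular-plural pairs")
    lists = {}
    for name, word_set in (("A", pairs[:20]), ("B", pairs[20:])):
        first, second = word_set[:10], word_set[10:]
        lists[name + "1"] = [s for s, _ in first] + [p for _, p in second]
        lists[name + "2"] = [p for _, p in first] + [s for s, _ in second]
    return lists


def list_orders():
    """All eight (List 1, List 2) orderings that pair an A list with a B list."""
    orders = []
    for a in ("A1", "A2"):
        for b in ("B1", "B2"):
            orders.append((a, b))
            orders.append((b, a))
    return orders


def session_lists(lists, participant_index, rng):
    """Pick a counterbalanced list order and shuffle the words within each list."""
    first, second = list_orders()[participant_index % 8]
    list1, list2 = list(lists[first]), list(lists[second])
    rng.shuffle(list1)
    rng.shuffle(list2)
    return list1, list2
```

A call such as `session_lists(build_study_lists(pairs), 3, random.Random(3))` would return two shuffled 20-word lists drawn from different sets; lists from the same set never appear together because `list_orders` only pairs an A list with a B list.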
Memory was tested with an Old/New recognition test. There were two testing conditions, varied between-subjects. The plurality discrimination condition required participants to discriminate the studied words from their plurality-changed versions. The new discrimination condition required discriminating the studied words from the completely new distractors that were not presented during encoding.
Eight different test lists were constructed for each testing condition. In the plurality discrimination condition, the old words consisted of half of the items from each study list (5 singular and 5 plurals selected from each list). The plurality-changed versions of the remaining study list items were used as the distractors. Thus, the test lists contained 40 items, 20 of which were old words (10 from List 1 and 10 from List 2), and 20 were distractors that were plurality-changed versions of the remaining 10 words from List 1 and 10 words from List 2. In the new distractor condition, the test lists were created by the same principle, except that all plurality-changed distractors were replaced with completely new words (matched in singular/plural form to the plurality-changed distractors). The new distractors came from the set of 20 pairs that were not used during the study.
Thus, participants in both recognition conditions had the same study list (e.g., rivers, book, town, windows, etc.), but received a different test list depending on the condition. In the plurality discrimination condition, they were tested with rivers, books, town, window, whereas in the new distractor condition, they were tested with rivers, beans, town, napkin.
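To make the relation between the two test conditions concrete, here is a small sketch of how such test lists could be assembled. It is an illustration under stated assumptions and not the authors' materials-generation code. Items are represented as hypothetical `(stem, is_plural)` tuples, plurals are formed by appending "s" (as with the regular nouns described above), and the random choice of which items remain old, as well as the shuffled test order, are assumptions.

```python
import random


def render(stem, is_plural):
    """Spell out a regular noun in its singular or plural form."""
    return stem + "s" if is_plural else stem


def build_test_list(list1, list2, new_stems, condition, rng):
    """Return 40 (word, status) test items: 20 'old' and 20 'distractor'.

    list1, list2: study lists of (stem, is_plural) tuples, each with
        10 singular and 10 plural items.
    new_stems: 20 unstudied stems, used only when condition == "new".
    condition: "plurality" or "new".
    """
    spare = list(new_stems)
    test = []
    for study_list in (list1, list2):
        singulars = [item for item in study_list if not item[1]]
        plurals = [item for item in study_list if item[1]]
        rng.shuffle(singulars)
        rng.shuffle(plurals)
        # Five singular and five plural items per list stay unchanged (old).
        for stem, is_plural in singulars[:5] + plurals[:5]:
            test.append((render(stem, is_plural), "old"))
        # The remaining ten items per list give rise to distractors.
        for stem, is_plural in singulars[5:] + plurals[5:]:
            if condition == "plurality":
                # Same word with its plurality reversed.
                word = render(stem, not is_plural)
            else:
                # A new word matched in form to the reversed distractor.
                word = render(spare.pop(), not is_plural)
            test.append((word, "distractor"))
    rng.shuffle(test)  # test order is an assumption of this sketch
    return test
```

In the example above, a studied item such as book would yield the distractor books in the plurality condition, and a new plural word such as beans in the new distractor condition.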
The design of the study was a Cue (forget vs. remember) × List (List 1 vs. List 2) × Discrimination Condition (plurality vs. new distractor) mixed factorial, with Lists as the only within-subjects factor.
The encoding phase was similar to the word condition of Experiment 2, except that the study lists always contained 20 items. All participants were told to pay attention to the plurality of each item during encoding because it was going to be important later on. For participants in the new distractor condition, this was a minor deception as it was not going to be helpful during the test. However, it was implemented in order to keep the encoding conditions constant. No mention was made of how memory was going to be tested.
After studying List 2, participants received an Old/New recognition test. Half of them were given a plurality discrimination test list, whereas the remaining half received a new distractor test list. Participants were told to press OLD if they remembered studying the word on either list (including the “practice” list, for the forget group participants), and to press NEW if they thought the word was not on either list. Participants in the plurality discrimination condition were told that some test items might be the plurality-changed versions of the study words, and that they had to respond positively only to the studied items and to reject their plurality-changed versions. They were told that only the singular or plural form of the word would appear on the test, but never both. Participants in the new discrimination condition were simply told to discriminate the studied items from unstudied items. An example was provided in both testing conditions.
The Hits and the False Alarms are presented in Table 4. Next, we report the analyses of the discrimination accuracy and response bias.
Using a mixed ANOVA, we analyzed d' scores with Cue (forget vs. remember) × Discrimination Condition (plurality vs. new distractor) × Lists (List 1 vs. List 2) as factors. The results are summarized in Table 5. There was a main effect of Discrimination Condition, F(1, 92) = 49.05, MSe = 1.31, p < .001, indicating better accuracy in the new discrimination condition (2.14) than in the plurality discrimination condition (.96). There was also a List × Cue interaction, F(1, 92) = 6.79, MSe = .313, p < .01, which was further modified by a significant 3-way interaction, F(1, 92) = 5.30, MSe = .313, p < .05. We followed up this interaction with a separate List × Cue ANOVA in each test condition.
When participants had to discriminate the studied items from the new distractors, there were neither main effects, nor an interaction, all Fs < 1. In other words, there was no directed forgetting when the distractors were new items. In contrast, when participants had to make plurality discriminations, there were no main effects (both Fs < 1), but there was a significant List × Cue interaction, F(1, 46) = 8.29, MSe = .455, p < .01, implicating a directed forgetting effect. Follow-up tests revealed significant differences between the forget group and the remember group on List 1 items, t(46) = 2.03, p < .05. However, there were no differences on List 2 items between the forget group and the remember group, t < 1. To summarize, in the new distractor condition, there was no directed forgetting effect – neither the costs, nor the benefits were significant. In contrast, in the plurality discrimination condition, there were significant directed forgetting costs, but no benefits.
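As a reminder of how the accuracy measure analyzed above is typically obtained, the sketch below computes d' as the difference between the z-transformed hit and false-alarm rates. It assumes the standard equal-variance signal detection form; the function name `d_prime` is hypothetical, and any correction applied to rates of 0 or 1 is not specified in this section and is therefore left out.

```python
from statistics import NormalDist


def d_prime(hit_rate: float, fa_rate: float) -> float:
    """Equal-variance d': z(hit rate) minus z(false-alarm rate).

    Both rates must lie strictly between 0 and 1; inv_cdf raises
    statistics.StatisticsError otherwise.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)
```

Under this form, a directed forgetting cost that lowers List 1 hits while leaving false alarms unchanged necessarily lowers d' for List 1, for example from `d_prime(0.8, 0.3)` to `d_prime(0.7, 0.3)` with illustrative rates.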
We analyzed the response bias statistic CL as a function of Cue and Discrimination Condition. Response bias was calculated from the pooled false alarms because there were no list-specific false alarms in the new distractor discrimination condition. There was only a main effect of Discrimination Condition, F(1, 92) = 61.97, MSe = .305, p < .001, indicating a more liberal response criterion in the plurality discrimination condition (.36) compared to the new distractor condition (1.25). Neither the main effect of Cue nor the interaction was significant, Fs < 1. In other words, directed forgetting did not influence the response bias.
In three experiments, we reported significant directed forgetting impairment in recognition accuracy by using either stimulus manipulations or distractor manipulations. In Experiments 1 and 2, we found significant differences between the forget and the remember groups in discriminability of non-words. No such impairment was observed with words in Experiment 2, or in prior work (Benjamin, 2006; Sahakyan & Delaney, 2005). Interestingly, these experiments reveal a double dissociation between stimulus type and the effects of directed forgetting: memory for non-words exhibits costs but no benefits, and memory for words exhibits benefits but no costs (the latter occurs only with longer lists; short lists produce neither the costs nor the benefits of directed forgetting). We believe that the difference in costs across stimuli reveals the greater effect of context on recognition of non-words (Russo et al., 1999) and the difference in benefits demonstrates the difficulty of engaging in elaborative encoding for stimuli that are devoid of meaning, such as non-words. However, it is the former finding that is of central interest here. The claim that circumstances promoting contextual effects in recognition contribute to the emergence of directed forgetting costs also finds support in the results of Experiment 3. In that study, the costs of directed forgetting were found even with words when the distractors were similar to the targets and required more fine-grained discrimination. The occurrence of directed forgetting impairment with words in Experiment 3 can be attributed solely to the processes solicited by testing, because all participants studied the same study list, but received a different test list requiring discrimination from plurality-reversed distractors or new distractors. Thus, the encoding conditions were kept constant, and any differences that emerged between experimental conditions must have been due to the retrieval processes.
Overall, conditions that rely on precise recovery of details from the study episode appear to promote the emergence of directed forgetting impairment in recognition. Other recent findings provide additional support for this hypothesis. For example, Loft, Humphreys, and Whitney (2008) demonstrated that intentionally forgotten materials interfere more with ongoing task performance under exclusion than inclusion conditions. Also, Gottlob and Golding (2007) found directed forgetting in source monitoring, where in addition to identifying the Old/New status of the words, participants had to retrieve their case/color/list membership. In their study, directed forgetting affected memory for item details (e.g., color/case) more consistently than memory for the item itself. Taken together, these findings underscore the claim that engaging in more effortful retrieval processes at the time of test contributes to detecting directed forgetting impairment in recognition. This is true both when attempting to retrieve the details about the item characteristics (such as their appearance) and in the exclusion instruction condition, where participants have to make a fine-grained discrimination at test by avoiding responding to the words from a certain list.
A number of related mechanisms could underlie the directed forgetting impairment with non-words, and they all stem from the consideration of contextual factors. For example, non-words could have relied more on context because they have no alternative bases for recognition. Because inter-item associations are harder to establish with non-words than words, contextual cues could play a larger role for recognition of non-words. Also, non-words could have formed stronger item-to-context associations during encoding because they have no prior history of occurrence in other contexts. Reduced competition from prior contexts could lead to richer encoding of contextual information in the memory trace of non-words compared to words. One potential criticism of such an interpretation is that non-words typically do not show superior memory for contextual details such as ink color, font type, or screen location compared to words (Mulligan, Lozito, & Rosner, 2006). It is important to distinguish the cuing property of context from the ability to retrieve the contextual details, and it is the latter that was examined in the Mulligan et al. (2006) study. Thus, one might form stronger item-to-context associations during encoding and benefit from the cuing of those associations during retrieval, yet not necessarily demonstrate an enhanced ability to retrieve contextual details when asked to specify the origin of remembered items. Furthermore, the Mulligan et al. (2006) study could be described as examining memory for the local context (i.e., changes that occur with the presentation of each item) rather than the global context (i.e., list episode), and the two types of context may be encoded differently. Whether processing of non-words establishes stronger item-to-context associations, or merely enhances the importance of contextual cues in recognition (because of the absence of better retrieval cues) remains to be further examined. However, both explanations can handle the larger directed forgetting impairment with non-words than with words.
Finally, directed forgetting impairment with non-words could emerge because of more direct retrieval processes involved in the recognition test. Retrieval-based recognition has been shown to be more susceptible to context effects than familiarity-based recognition (Macken, 2002). If recognition of non-words more actively engaged retrieval processes, then this would explain larger directed forgetting with non-words than words. One criticism of such an interpretation is the claim that non-words may actually elicit less recollection than words. For example, non-words typically produce more Know responses and fewer Remember responses compared to words in the Remember/Know procedure (e.g., Gardiner & Java, 1990; Greene, 2004; Macken, 2002; Rajaram, Hamilton, & Bolton, 2002). However, the present results demonstrate different response criteria for the recognition of words and non-words, and such criteria are known to also affect the rate of Remember responses (Benjamin, 2005; Hirshman & Hentzler, 1998). In other words, an alternative interpretation of the lower rate of Remember responses with non-words is that it reflects a response bias rather than a deficit in recollection (e.g., Dunn, 2004).
Although our primary focus in this paper was on the costs of directed forgetting, we briefly discuss the directed forgetting benefits, which emerged only with long lists of words in Experiment 2. They did not emerge with short word lists, or with non-words. According to the strategy-change account, directed forgetting benefits reflect a strategic encoding advantage in the forget group. Use of non-associative stimuli likely precluded the application of better encoding strategies, which can explain the absence of benefits with non-words. Words, on the other hand, can benefit from organizational encoding, but the benefit of such encoding was only evident with longer lists. These results replicate prior reports of directed forgetting benefits, which were found only with the longer lists (Benjamin, 2006; Sahakyan & Delaney, 2005), and imply that the list-method directed forgetting benefit in recognition is mediated by the list length. Compared to the short word lists, long lists in Experiment 2 hurt overall memory in the remember group more than in the forget group, indicating a greater list-length effect in the remember group than in the forget group. This could occur because increasing the length of the list impaired recollection (e.g., Cary & Reder, 2003; Yonelinas & Jacoby, 1994), but more elaborate encoding of List 2 items in the forget group provided support for recollection and offset the list-length effect, allowing for the benefits to emerge. Shorter lists, on the other hand, may have placed less demand on recollection, and familiarity-based recognition provided sufficient basis for discrimination. Therefore, with short study lists, performance in the remember group was in a range that masked the benefits in the forget group.
The absence of benefits in Experiment 3 could be linked either to the shorter lists employed in that study, or to the encoding instructions, which directed participants to pay attention to the plural/singular status of the words. We employed such instructions because prior research showed that drawing attention to the plurality aspect during encoding more extensively engages retrieval-based recognition compared to the standard instructions (e.g., Rotello et al., 1999). However, such encoding demands may have precluded successful application of better encoding strategies because participants' resources were taxed as they were dual-tasking. These questions require further investigation and are beyond the scope of the current paper.
In conclusion, our studies have provided the first empirical demonstration that directed forgetting costs can be reliably obtained in recognition when stimuli are used, or tests are employed that enhance the utilization of contextual information. We hope that our findings will encourage more active investigation of this phenomenon, which has long eluded the field.
Author Note Lili Sahakyan, Department of Psychology, University of North Carolina-Greensboro. Emily R. Waldum, Department of Psychology, University of North Carolina-Greensboro. Aaron S. Benjamin, Department of Psychology, University of Illinois, Urbana-Champaign. Samuel P. Bickett, Department of Psychology, University of North Carolina-Greensboro.