We tested the hypothesis that a verbal coding mechanism is necessarily engaged by object, but not spatial, visual working memory tasks. We employed a dual-task procedure that paired n-back working memory tasks with domain-specific distractor trials inserted into each interstimulus interval of the n-back tasks. In two experiments, object n-back performance demonstrated greater sensitivity to verbal distraction, whereas spatial n-back performance demonstrated greater sensitivity to motion distraction. Visual object and spatial working memory may differ fundamentally in that the mnemonic representation of featural characteristics of objects incorporates a verbal (perhaps semantic) code, whereas the mnemonic representation of the location of objects does not. Thus, the processes supporting working memory for these two types of information may differ in more ways than those dictated by the “what/where” organization of the visual system, a fact more easily reconciled with a component process than a memory systems account of working memory function.
Appreciation of the functional organization of the mammalian visual system (Ungerleider & Haxby, 1994; Ungerleider & Mishkin, 1982) has led to the widely accepted view that working and short-term memory for objects (“what”) and locations (“where”) are computed by at least partially discrete neural systems in monkeys (Wilson, O’Scalaidhe, & Goldman-Rakic, 1993) and humans (e.g., Courtney, Ungerleider, Keil, & Haxby, 1996; McCarthy et al., 1996; Mecklinger & Muller, 1996). Behavioral studies in humans also confirm that working memory for these two domains of information is supported by distinct visually based mental codes representing object identity and location, respectively (e.g., Della Sala, Gray, Baddeley, Allamano, & Wilson, 1999; Hecker & Mapperson, 1997; Smith et al., 1995; Tresch, Sinnamon, & Seamon, 1993). The purpose of the experiments presented in this report, however, was to explore whether spatial and object working memory may also differ along an axis orthogonal to that dictated by the functional organization of the visual system: Object working memory may automatically, obligatorily engage verbal coding mechanisms, whereas spatial working memory may not.
This theoretical proposition arose from our experience with spatial/object manipulations in memory tasks—delayed recognition (Postle & D’Esposito, 1999; Postle, Jonides, Smith, Corkin, & Growdon, 1997), conditional-associative learning (Postle, Locascio, Corkin, & Growdon, 1997), and the n-back task (Postle, Stern, Rosen, & Corkin, 2000). This research has indicated that participants often adopt a strategy of verbally encoding stimuli in the object condition, although our tasks have used relatively “nonverbalizable” Attneave shapes. Participants in these studies of object memory were never instructed to use verbal coding to mediate performance on the tests, yet post-test debriefing suggested that the tendency to do so was strong and was consistent across age groups (from late teens to 80s), neurological status (healthy, Parkinson’s disease, stroke, and medial temporal-lobe amnesia), and testing environment (behavioral laboratory and fMRI laboratory). We had not observed a similar tendency in tests of spatial memory that were procedurally identical to the object tasks and that were administered in the same session.
The idea that the representations of objects in working memory include a semantic code can be seen as an extension of the simultaneous multiple encoding theory of Wickens (1972, 1973), which holds that words can be encoded according to their semantic attributes, the physical characteristics of their presentation at the time of encoding, and other attributes (such as language, frequency, representing symbol, and imageability). Thus, Wickens argued that words are encoded not only according to their central function (the conveyance of meaning) but also according to attributes and contextual factors that can vary independently of a word’s semantic content. We, in turn, are proposing that the mnemonic representation of visually presented object stimuli may encode not only the visual features of the object (e.g., size, color, texture, shape), but also verbal information associated by the individual with the visual stimulus. Furthermore, we propose that this association occurs automatically, such that a verbal code is an inherent part of the working memory representation of a visually presented object. Initially, we could not specify whether this code is semantic or only lexical (as would be the case with a name devoid of meaning), although the results of the present study helped us to sharpen this aspect of our claim. In addition, our empirical observations led us to assert that the automaticity of verbal coding does not extend to all visually presented information. Specifically, we posited that it does not apply to the mnemonic representation of locations in space.
The theoretical and empirical literatures germane to the mnemonic coding of visually presented objects and locations, however, are incomplete. It is widely assumed that locations in extrapersonal space are represented mentally in a nonverbal analog code (Attneave, 1972). Posner reported (Posner & Konick, 1966) and replicated (Posner, 1967) data consistent with an important role for rehearsal in spatial working memory: Reproduction of a visually guided movement showed minimal forgetting across an unfilled 20-sec delay interval but substantial forgetting when the delay was interpolated with a digit classification task. Posner (1967) posited that the term rehearsal need not be restricted to verbal processing but could apply to any case in which information retention required central processing capacity. Posner and Konick emphasized introspective reports of participants in concluding that the memory code employed to remember a spatial location was largely nonverbal, but they did not test this idea directly.
Familiar objects can be represented in many different codes. For example, tests of comparison of serially presented representational object stimuli (e.g., line drawings of buildings and photographs of cars) have revealed evidence for multiple levels of representational codes, including a visual object code and a nonvisual semantic code that may or may not be verbal (Bartram, 1976). Other work has suggested that this semantic code is, in fact, verbally mediated and that participants’ expectations about the nature of a probe stimulus (either a picture of or the name for a well-learned schematic face) in a delayed-recognition test was a more important determinant of the accessibility of the verbal code versus the pictorial code associated with an item than was the modality (verbal or pictorial) in which the target stimulus had been presented (Tversky, 1969).
Most relevant to the present question are investigations using abstract object stimuli that do not inherently represent objects in the real world. Cermak (1977) demonstrated that (13-sec) delayed-comparison memory for abstract outline figures was sensitive to the ease with which target and probe stimuli could be interpreted as representing the same objects (e.g., a hippopotamus head or a human face). Although these results demonstrated that semantic interpretation of an object can have an important influence on working memory, they did not address whether the use of semantic codes is fundamental to the short-term retention of information about objects or whether it is strategic. This is because Cermak’s experimental procedures deliberately induced subjects to establish semantic codes for studied items.
More recently, a study by Simons (1996) directly compared working memory for spatial or object characteristics of complex pictures with those of objects in an array. Simons reported that object memory can be considerably worse than spatial memory, and that only object memory was sensitive to experimental manipulations intended to block verbal labeling. On the basis of these results, Simons proposed that memory for objects and for spatial layout are mediated by fundamentally different mechanisms, with successful working memory for objects requiring verbal encoding, but that for spatial layouts requiring automatic encoding without verbal mediation. Simons’s data, however, do not satisfactorily address the present question concerning the role of verbalization in object versus spatial working memory. The problem is that Simons’s data contain a difficulty confound: Performance in object conditions was significantly lower than that in spatial conditions in each of his experiments. His data, therefore, are equally consistent with the alternative view that verbalization strategies in working memory are dependent on task difficulty rather than on stimulus material.
The experiments presented here were designed to test conclusively two related hypotheses: that a verbal code is fundamental to the representation of objects in working memory and that the representation of locations in working memory does not require a verbal code. We used a dual-task procedure for these studies, in which a primary working memory task was performed in parallel with a series of distractor trials. The logic was that interactions between distractors featuring different domains of information (e.g., motion vs. words) and working memory tasks also containing different stimulus information domains would reveal some of the mental codes that support spatial and object working memory performance (Crowder, 1993; Posner, 1978).
We used the n-back task, a continuous performance working memory task in which participants view the serial presentation of stimuli and judge whether each is a repetition of the stimulus that appeared n stimuli previously. For example, in a two-back test, the third item of the series A B A is a two-back match, whereas the third item of the series C B A is not. This task is believed to engage several mental operations, including: (1) encoding a stimulus into working memory; (2) maintenance of this mnemonic representation despite the subsequent presentation of additional interfering, attentionally salient stimuli; (3) shifting attention back to this mnemonic representation when necessitated by task contingencies; (4) discriminating between this mnemonic representation and the stimulus on the screen, and guiding behavior with the outcome of this discrimination; and (5) retagging each still-relevant mnemonic representation with a new positional code to reflect the updating of the contents of working memory that must happen with the appearance of each new stimulus (Jonides et al., 1997; Postle et al., 2000).
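The matching rule described above can be sketched in a few lines of Python. This is only an illustration of the task logic, not the software used in the experiments; the function name `n_back_matches` is a hypothetical label introduced here.

```python
from typing import List, Sequence


def n_back_matches(stimuli: Sequence[str], n: int) -> List[bool]:
    """For each position, report whether the stimulus repeats the one n positions earlier.

    The first n positions can never be matches, because no stimulus
    appeared n items before them.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return [i >= n and stimuli[i] == stimuli[i - n] for i in range(len(stimuli))]


# The two-back example from the text: only the third item of A B A is a match.
print(n_back_matches("ABA", 2))  # [False, False, True]
print(n_back_matches("CBA", 2))  # [False, False, False]
```

Each returned value corresponds to the correct "yes" (True) or "no" (False) response for that stimulus.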
To simplify the interpretation of predicted primary task–secondary task interactions, we designed this version of the procedure in an interleaved manner such that each distraction trial occurred during the interstimulus interval (ISI) of the n-back task. That is, the distractor onset occurred after the offset of the preceding n-back stimulus, and the distractor offset occurred prior to the onset of the next n-back stimulus. In this way, we sought to limit the effects of distractors to processes (2) and (5) from the previous paragraph. Thus, any interactions between distractor-task and n-back performance could be ascribed primarily to effects on the maintenance and/or control of working memory representations, as opposed to encoding-related or response-related processes.
Our design employed the logic of dual-task interference that if a secondary task interferes selectively with performance on primary task A as compared with primary task B, one can infer that primary task A and the secondary task draw on common cognitive resources (Crowder, 1993; Posner, 1978). The primary task in this experiment, the n-back working memory task (Cohen et al., 1994; Gevins & Cutillo, 1993), varied according to two factors—stimulus domain (object vs. spatial) and difficulty (easy vs. difficult). The factor of stimulus domain embodied our theoretical proposition. The factor of difficulty permitted us to sort out the relative importance of stimulus domain versus difficulty in the predicted pattern of primary task–secondary task interactions.
The distractor tasks were administered as discrete trials that occurred during each ISI of the block of n-back stimuli. They were yes/no discrimination tasks that required syntactic judgments of words or attentional tracking of moving stimuli (the former designed to tax verbal resources, the latter to tax “dorsal stream” visual processing resources). We predicted that stimulus domain and distractor type would interact, reflecting a greater effect of verbal distraction on object than on spatial n-back performance, and the converse for motion distraction, regardless of the level of difficulty of the n-back task.
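The predicted stimulus domain × distractor type interaction can be expressed as a single difference-of-differences contrast on d′ scores. The sketch below is an illustration of that logic only: the function name is hypothetical, and the numbers in the usage example are made up for demonstration and are not data from this study.

```python
from typing import Dict, Tuple


def domain_by_distractor_contrast(dprime: Dict[Tuple[str, str], float]) -> float:
    """Difference of differences on mean d' (higher d' = better performance).

    A positive value means verbal distraction (relative to motion distraction)
    lowered object n-back performance more than spatial n-back performance,
    which is the direction of the predicted interaction.
    """
    object_cost = dprime[("object", "motion")] - dprime[("object", "verbal")]
    spatial_cost = dprime[("spatial", "motion")] - dprime[("spatial", "verbal")]
    return object_cost - spatial_cost


# Hypothetical cell means, chosen only to show a full crossover pattern.
illustrative_means = {
    ("object", "motion"): 3.0,
    ("object", "verbal"): 2.0,
    ("spatial", "motion"): 2.0,
    ("spatial", "verbal"): 3.0,
}
print(domain_by_distractor_contrast(illustrative_means))  # 2.0
```

A contrast of zero would indicate that the two distractors affected both primary tasks equally, that is, no selective interference.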
We recruited 45 participants, ranging in age from 18 to 30, from the University of Wisconsin–Madison community. However, nearly half of them were unable to perform the verbal distraction task above chance level and thus were excluded from the study. Twenty-four participated in the experiment.
Testing was conducted on Macintosh computers with programming in MacTCL. The n-back task consisted of serial presentation of stimuli (2-sec exposure duration), with the participants pressing one of two keys in response to each stimulus, indicating whether each stimulus was a repetition of the stimulus that appeared n stimuli previously. The “yes” key was depressed with the right hand and the “no” key with the left. The ISI separating n-back stimuli was 6 sec, and there were 40 n-back stimulus presentations per block (yielding a block duration of 5 min, 14 sec). The shapes used in the object n-back tests were those from the set introduced by Attneave and Arnoult (1956) that had been rated with the lowest association value in normative testing of a large set of the shapes (i.e., fewer than 30% of the participants could associate them with a real shape; Vanderplas & Garvin, 1959). In the object n-back task, any of nine possible Attneave shapes that each looked distinctive appeared at the center of the screen; each shape was used an average of 4.4 times in each block. In the spatial n-back tasks, the stimuli were identical black circles appearing in any of nine possible locations on the screen (the nine equally sized regions that would be created by dividing the screen with two vertical and two horizontal lines); each location was used an average of 4.4 times in each block.
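The block duration and the average usage per stimulus follow directly from the timing parameters given above. The short Python sketch below reproduces that arithmetic; it assumes that a block consists of 40 stimulus exposures separated by 39 ISIs, with no lead-in or trailing interval, and the function name is hypothetical.

```python
def block_duration_sec(n_stimuli: int = 40, exposure_sec: int = 2, isi_sec: int = 6) -> int:
    """Total block time: every stimulus exposure plus the ISIs between consecutive stimuli."""
    return n_stimuli * exposure_sec + (n_stimuli - 1) * isi_sec


total = block_duration_sec()          # 40 * 2 + 39 * 6 = 314 sec
minutes, seconds = divmod(total, 60)  # 5 min, 14 sec
mean_uses_per_item = 40 / 9           # about 4.4 presentations per shape or location

print(minutes, seconds)
print(round(mean_uses_per_item, 1))
```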
Difficulty was operationalized as the n that was selected for the n-back task. It is generally assumed that varying the n in this task effects parametric changes in the difficulty, or “load,” of this task, rather than fundamental changes in the kinds of mental processes recruited by the task. Evidence for this assumption includes the fact that reaction time (RT) varies linearly as a function of n in the n-back task (Braver et al., 1997), just as it varies as a function of load in the item-recognition task (Sternberg, 1966). Similarly, neuro-imaging signal in several subregions of prefrontal and parietal cortex has also been shown to vary monotonically as a function of load in the n-back task (Braver et al., 1997; Cohen et al., 1997; Jonides et al., 1997), and these brain regions have been shown in several (independent) neuropsychological and neuroimaging studies to contribute to the maintenance of information in working memory.
Object working memory was tested in blocks of one-back and two-back trials and spatial working memory in blocks of two-back and three-back trials. Pilot studies indicated that object n-back performance is inferior to spatial n-back performance at any given n, and thus we selected different ns for tests employing the two different stimulus domains in an effort to equate the absolute level of difficulty between them. For each block of n-back stimuli, there were 8 stimuli requiring a “yes” response and 32 requiring a “no” response. The sequence of stimuli and the position of “yes” trials within each block were determined pseudorandomly, with the constraint that no two-back or three-back matches occurred in one-back blocks, no one-back or three-back matches in two-back blocks, and no one-back or two-back matches in three-back blocks. N-back performance was converted to d′ for analysis.
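For readers unfamiliar with the measure, d′ is the difference between the z-transformed hit rate and the z-transformed false alarm rate. The sketch below shows one common way to compute it in Python. The log-linear correction (adding 0.5 to each count) is an assumption introduced here to keep the score finite when a rate is 0 or 1; the text does not state which correction, if any, was applied.

```python
from statistics import NormalDist


def d_prime(hits: int, misses: int, false_alarms: int, correct_rejections: int) -> float:
    """Signal detection sensitivity: z(hit rate) - z(false alarm rate).

    Assumption: a log-linear correction (+0.5 per count, +1 per total) is used
    so that perfect or empty cells do not produce infinite z scores.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


# One block has 8 "yes" trials and 32 "no" trials.
perfect_block = d_prime(hits=8, misses=0, false_alarms=0, correct_rejections=32)
chance_block = d_prime(hits=4, misses=4, false_alarms=16, correct_rejections=16)
print(perfect_block > chance_block)  # True
```

With equal hit and false alarm rates, as in the second call, d′ is zero, reflecting no discrimination between match and nonmatch trials.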
There were three levels of distraction: motion, verbal, and no-distraction. Only 1 distractor task occurred during each n-back ISI, so that 39 distractor tasks were performed during each block that paired primary and secondary task performance. The distractor tasks were designed such that their trials did not span the entire length of the n-back ISI, and such that their response mode was verbal. This was done to simplify the interpretation of any interference that we might observe between distractor and working memory task, because that could be interpreted as competition for common resources at the level of the maintenance of central representations, rather than at the level of stimulus encoding or of preparation for motor output. Such an interpretation would map most directly onto our theoretical proposition about the nature of the mental codes representing information in working memory. The dependent measure of each distractor task was accuracy.
Motion distractor trials (modified from He, Cavanagh, & Intriligator, 1996) were designed to tie up processing resources of the dorsal visual stream (i.e., the “where” system). These trials began with seven circles of identical size arranged in a distinctive array, with two of the circles (randomly selected) colored red and the remaining five colored gray. After an initial presentation of 500 msec, the two red circles changed color to gray, and the seven circles moved about the screen in random directions at a speed of 9.2 cm/sec. The circles stopped moving 1.5 sec after trial onset, and two of the gray circles changed color to red; the probability that these two circles would be the same as the two that were highlighted at the beginning of the trial was .5. The participants responded verbally “same” or “diff” to indicate whether the two highlighted circles matched those that had been highlighted at the beginning of the trial. (The participants were trained prior to the experiment to use the truncation “diff” for different, so that the verbal responses for match and nonmatch trials would contain the same number of syllables.)
Verbal distractor trials were designed to tie up syntactic, and perhaps semantic, resources that are engaged when one assigns a verbal label to an object. These trials presented an abstract word, and the participants indicated verbally whether the word was a “noun” or an “ad.” (The participants were trained prior to the experiment to use the truncation “ad” for adjective.) Successful performance of this syntactic category judgment task required access to syntactic knowledge and may have also entailed activation of semantic information about word stimuli. Abstract word stimuli ranged in frequency from 0 to 33 occurrences per million (Kuçera & Francis, 1967) and were judged to be abstract by the investigators. Examples of stimuli include wile, terse, and limpid. (Although both distractor tasks were also expected to create articulatory suppression, there were no obvious disparities in the extent of suppression because each required a single monosyllabic response on each trial.)
Testing was performed in 14 blocks, corresponding to all possible n-back/distraction combinations (including “no-distraction” blocks in which one block of n-back was performed with each type of stimuli and at each level of difficulty and “no-memory” blocks in which 39-trial blocks of distraction were performed alone). Partial counterbalancing was achieved by constructing 14 × 14 Latin square tables that specified 14 different block orders for testing. Each block appeared once in the first position of a row, and each participant was tested in the order specified by the row corresponding to that participant’s order of recruitment. The order of testing blocks in the first row of each table was determined randomly, and a new table was constructed (with a new randomly determined first-row order) each time a table was filled. Analyses performed on pilot data confirmed that this procedure prevented possible contamination of our results with practice or order effects.
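One simple way to build a table with the properties described above is a cyclic Latin square: shuffle the conditions to obtain the first row, then rotate that row by one position for each subsequent row. The sketch below assumes this cyclic construction, which the text does not specify; the function name is hypothetical.

```python
import random
from typing import List, Sequence


def cyclic_latin_square(conditions: Sequence[str], rng: random.Random) -> List[List[str]]:
    """Build a k x k Latin square from a randomly ordered first row.

    Row r is the first row rotated left by r positions, so each condition
    appears exactly once in every row and in every column (and therefore
    exactly once in the first position of a row).
    """
    first_row = list(conditions)
    rng.shuffle(first_row)
    k = len(first_row)
    return [[first_row[(r + c) % k] for c in range(k)] for r in range(k)]


blocks = [f"block_{i:02d}" for i in range(1, 15)]  # placeholder labels for the 14 blocks
table = cyclic_latin_square(blocks, random.Random(2024))
order_for_third_participant = table[2]  # row index = order of recruitment - 1
```

A cyclic square balances ordinal position only; it does not balance which block immediately precedes which, so this is partial counterbalancing in the same sense as in the text.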
We trained the participants on each condition of the working memory and distractor tasks, in individual and dual-task presentations, prior to data collection. Training consisted, minimally, of performance of one complete block of each working memory and distractor task alone, plus simultaneous performance of one block of working memory plus distraction. (The working memory condition and distractor type used for training on the dual-task procedure were selected arbitrarily.) Training blocks were repeated if participants expressed or displayed difficulty understanding or executing instructions. The vast majority of participants required only a single exposure to each training block before they were ready to proceed to the experiment. Testing forms used for training were not used during data collection.
Our 3 × 2 × 3 repeated measures design crossed n-back stimulus domain (spatial, object, no memory) with n-back difficulty (easy, difficult) with distractor type (motion, verbal, no distraction) for a total of 14 testing blocks per subject (the difficulty factor did not apply to no-memory blocks; the no-memory/no-distractor cell was empty). n-back difficulty was operationalized as one-back (“easy”) and two-back (“difficult”) for object working memory and as two-back (“easy”) and three-back (“difficult”) for spatial working memory.
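The count of 14 blocks can be verified by enumerating the cells of the design. The Python sketch below does so under the constraints stated above (no difficulty factor for no-memory blocks, and no no-memory/no-distraction cell); the function name and condition labels are hypothetical.

```python
from itertools import product
from typing import List, Optional, Tuple

Block = Tuple[str, Optional[str], str]


def testing_blocks() -> List[Block]:
    """Enumerate (stimulus domain, difficulty, distractor type) for every testing block."""
    # 2 domains x 2 difficulty levels x 3 distraction levels = 12 n-back blocks.
    memory_blocks: List[Block] = list(
        product(("spatial", "object"), ("easy", "difficult"), ("motion", "verbal", "none"))
    )
    # Distractor tasks performed alone: difficulty does not apply, and the
    # no-memory/no-distraction cell is empty, leaving 2 blocks.
    no_memory_blocks: List[Block] = [("no memory", None, d) for d in ("motion", "verbal")]
    return memory_blocks + no_memory_blocks


print(len(testing_blocks()))  # 14
```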
The results indicated that the two working memory tasks were roughly equated for difficulty (Figure 1). The distraction results, in contrast, revealed that the verbal distraction task was markedly more difficult than the motion distraction task (Table 1). An omnibus 2 (n-back stimulus domain) × 2 (n-back difficulty) × 3 (distraction task) analysis of variance (ANOVA) of working memory-task performance revealed a borderline main effect of n-back stimulus domain [F(1,23) = 3.43, MSe = 4.18, p = .08], and reliable main effects of difficulty [F(1,23) = 38.35, MSe = 4.16, p < .0001] and of distraction [F(2,46) = 5.38, MSe = 4.28, p < .0001]. The only significant interaction was that of n-back stimulus domain × distraction [F(2,46) = 3.36, MSe = 2.51, p < .05; all unreported Fs ≤ 1.0].
We followed up the omnibus ANOVA with a 2 (stimulus domain) × 2 (difficulty) × 2 (distraction) ANOVA that only considered the two theoretically motivated levels of distraction: motion and verbal. This ANOVA found no main effect of n-back stimulus domain [F(1,23) = 2.06, MSe = 3.50, n.s.], but reliable main effects of difficulty [F(1,23) = 21.87, MSe = 4.15, p < .0001] and of distraction [F(1,23) = 7.16, MSe = 2.80, p < .05], and just one significant interaction, that of n-back stimulus domain × distraction [F(1,23) = 7.26, MSe = 2.25, p < .05]. This interaction reflected the fact that the effect of concurrent verbal distraction was markedly greater on object n-back performance than on spatial n-back performance, a result consistent with our prediction that object n-back performance would be disproportionately sensitive to verbal distraction. We did not, however, find evidence for the other half of the predicted crossover interaction, that spatial n-back performance would be disproportionately sensitive to motion distraction. The absence of this predicted effect might be due, in part, to a tradeoff of secondary task performance for primary task performance, as suggested by the analysis of distractor task performance that is reported in the following paragraph.
Inspection of distractor performance (Table 1) confirmed that the verbal distractor task was, indeed, markedly more difficult than the motion distractor task. The three-factor ANOVA revealed a main effect of distractor [F(1,23) = 83.43, MSe = 45.42, p < .0001]. No other main effects or interactions achieved significance. Despite the absence of a three-way interaction [F(1,23) = 2.64, MSe = 6.86, p = .12], we performed post hoc 2 (n-back stimulus domain) × 2 (n-back difficulty) ANOVAs of performance with each distractor task to explore the possibility of a secondary task–primary task tradeoff, as was suggested by inspection of the data in Table 1. The ANOVA of motion distractor task performance revealed main effects of n-back stimulus domain [F(1,23) = 10.10, MSe = 1.06, p < .005] and of n-back difficulty [F(1,23) = 8.30, MSe = 1.82, p < .01], and an interaction between the two [F(1,23) = 4.50, MSe = 2.07, p < .05]. This interaction reflected the fact that motion distractor performance was sensitive to the difficulty of the concurrent spatial n-back task, but not to the difficulty of the concurrent object n-back task. The analogous ANOVA of verbal distractor task performance, in contrast, revealed no significant effects.
The results of Experiment 1 indicated that object n-back performance was markedly more sensitive to verbal than to motion distraction, a pattern that differed from that seen with spatial n-back performance. Whether they also displayed the converse effect is unclear, because the apparent lack of a selective effect of motion distraction on spatial n-back performance may have been achieved at the expense of a tradeoff with motion distractor task performance. Thus, the results were consistent with our prediction that object n-back performance would be more sensitive to verbal than to motion distraction, and they were equivocal with respect to the selective sensitivity of spatial n-back performance. (Note that the latter effect was not of great theoretical import to our study, because it is well established that dorsal stream visual distraction can disrupt working memory for locations [e.g., Hecker & Mapperson, 1997; Tresch et al., 1993]. Rather, greater disruption of spatial working memory performance by the motion distractor task would rule out a difficulty explanation of the effect of verbal distraction on object working memory.) Thus, these results offer preliminary evidence that is consistent with our proposition that object working memory engages verbal coding mechanisms, whereas spatial working memory does not.
Stronger inference, however, requires us to resolve several empirical shortcomings of Experiment 1. First, although we were able to avoid the confound of disparate levels of difficulty in the working memory tasks (a concern in interpreting the results of Simons, 1996), we introduced a new difficulty confound: that of disparate levels of difficulty in our two distractor tasks. Second (and perhaps relatedly), the second half of the predicted crossover interaction—that spatial n-back performance would be more sensitive to motion than to verbal distraction—was only realized in the tradeoff of motion distractor task performance for spatial n-back performance. Third (and most certainly relatedly), the demands of the verbal distraction task resulted in the exclusion of nearly half of the participants who had been randomly selected for this study, a fact that may have introduced a selection bias as well as bias resulting from incomplete counterbalancing. Finally, the effect of n-back difficulty on our results was unclear: Whereas the absence of any reliable interactions with the factor of n-back difficulty indicates that such effects were not detectable in the working memory data, n-back difficulty clearly did interact with distractor task performance. For all these reasons, Experiment 2 was designed to assess the replicability of the qualitative pattern of results from Experiment 1 but to avoid its confounds.
This experiment replicated the procedure of Experiment 1 with the exception of the verbal distractor task, which was modified in two ways. First, it was designed to be easier, thereby bringing its difficulty level closer to that of the motion distractor task. (The goal of a distractor task in dual-task interference designs is to tie up resources of a putative mental code, but to avoid, as much as possible, increasing difficulty in a nonspecific manner. Thus, for Experiment 2, we opted to make the verbal distractor task easier, instead of making the motion distractor task more difficult.) Second, the verbal distractor task required semantic, rather than syntactic, processing, thereby permitting more precise theoretical interpretation of the selective interference that it may produce. Logically, the disruptive effects of the verbal distractor task from Experiment 1, by requiring a syntactic comparison, could have been at the level of a syntactic code, a semantic code, or both. As we reconsidered the design of this distractor task for the present experiment, however, we reasoned that the a priori likelihood that mnemonic representations of objects incorporate a syntactic code was low, particularly in view of evidence that even word stimuli are not encoded into working memory according to their grammatical characteristics (Goggin, 1974; Wickens, 1973). Thus, the redesign of the verbal distraction task (detailed below) permitted us to refine our original theoretical proposition by specifying that one way in which visually presented objects are represented in working memory is with a semantic code.
Seventy adults ranging in age from 18 to 30, all from the University of Wisconsin–Madison community, participated in this experiment.
The apparatus and tasks were identical to those used in Experiment 1, with the exception of modifications to the verbal distractor task. Each trial in the modified verbal distractor task presented two nouns simultaneously, and the participants were instructed to make an animacy judgment (i.e., “living” or “nonliving”) about each and to say aloud whether the two were the “same” or “diff,” with respect to animacy. (Thus, a trial presenting sentiment and hound required a response of “diff,” whereas trials presenting fork and ignorance or lobster and boss required a response of “same.”) The nouns had a written frequency ranging from 20–40 per million (Kuçera & Francis, 1967).
Counterbalancing, training, and testing procedures were all the same as in Experiment 1.
Again, the 3 × 2 × 3 repeated measures design crossed n-back test (spatial, object, no memory) with difficulty (easy, difficult) and distractor type (motion, verbal, no distraction) for a total of 14 testing blocks per subject. Difficulty was operationalized as one-back and two-back for object working memory, and as two-back and three-back for spatial working memory. Analysis strategies and predictions were the same as in Experiment 1.
The results indicated that the two working memory tasks were again of roughly equivalent difficulty (Figure 2). Unlike Experiment 1, however, the difficulty level of the verbal distractor task was close to (although still significantly greater than) that of the motion distractor task (Table 2). The 2 (stimulus domain) × 2 (difficulty) × 3 (distraction) omnibus ANOVA revealed a borderline main effect of n-back stimulus domain [F(1,69) = 3.19, MSe = 7.97, p = .08], reliable main effects of difficulty [F(1,69) = 194.99, MSe = 4.16, p < .0001] and of distraction [F(1,69) = 75.79, MSe = 4.84, p < .0001], and just one significant interaction, that of n-back stimulus domain × n-back difficulty [F(1,69) = 6.69, MSe = 3.72, p = .01; all unreported Fs < 2.5].
The 2 (stimulus domain) × 2 (difficulty) × 2 (distraction) ANOVA, which focused on the two levels of distraction that were theoretically motivated (motion vs. verbal), revealed no main effect of stimulus domain [F(1,69) = 0.90, MSe = 7.43, n.s.] or of distraction [F(1,69) = 0.72, MSe = 4.06, n.s.], a reliable main effect of difficulty [F(1,69) = 195.16, MSe = 3.07, p < .0001], a significant interaction of n-back stimulus domain × n-back difficulty [F(1,69) = 9.20, MSe = 2.96, p < .005], and a significant interaction of n-back stimulus domain × distraction [F(1,69) = 4.24, MSe = 2.68, p < .05]. The latter interaction was of principal theoretical importance, because it indexed differential effects of the two distractors on the two primary tasks. (Post hoc pairwise comparisons performed to assess this interaction in detail indicated that the effect on spatial n-back of motion distraction was not greater than the effect of verbal distraction [t(69) = .71, n.s.], and that the effect on object n-back of verbal distraction was not greater than that of motion distraction [t(69) = 1.82, p = .07].) Returning to the 2 × 2 × 2 ANOVA, neither the interaction of n-back difficulty × distraction [F(1,69) = 2.01, MSe = 3.58, n.s.] nor the three-way interaction [F(1,69) = 0.03, MSe = 4.34, n.s.] was significant.
The 2 (distraction task) × 2 (n-back stimulus domain) × 2 (n-back difficulty) ANOVA assessing distractor performance revealed reliable main effects of distractor [F(1,69) = 36.80, MSe = 0.004, p < .0001], n-back stimulus domain [F(1,69) = 14.56, MSe = 0.001, p < .0001], and n-back difficulty [F(1,69) = 10.92, MSe = 0.002, p < .0001], but no significant interactions.
The results of Experiment 2 were, again, consistent with our prediction, in that object n-back performance displayed greater sensitivity to semantic distraction, whereas spatial n-back performance displayed greater sensitivity to motion distraction. (Note that the latter effect was only observed in the difficult condition, perhaps because the easy condition left sufficient resources unoccupied, making it more sensitive to the nonspecific difficulty of the distractor task than to the mental code that it tapped.) Importantly for our theoretical claim, the n-back stimulus domain × distraction interaction took the form of a crossover interaction, thereby making its interpretation more straightforward than was the case in Experiment 1. With regard to difficulty, the absence of interactions between the factors of n-back difficulty and distraction indicated that this pattern of domain-specific interference effects did not vary as a function of working memory task difficulty.
Distractor task performance was also easier to interpret in Experiment 2. Perhaps most salient was the fact that, unlike in Experiment 1, no participant was dropped from the data set due to unsatisfactory performance of a distractor task. Also important for the interpretation of the results was the fact that there was no evidence of primary–secondary task tradeoffs. The apparent ceiling effect of the motion distraction task would only present a problem if there were no evidence that this task disrupted performance on the primary task, a situation that did not occur. There did remain an overall difference in the difficulty of the verbal versus the motion distractor tasks. As mentioned above, this may account for the quantitatively greater effect of verbal than of motion distraction on spatial two-back performance. But since this disparity in difficulty did not override the interaction of distractor task with n-back stimulus domain, we can assume that each distractor task occupied a comparable amount of resources (either semantic or visual-attentional) during its performance with each of the primary memory tasks.
Our results extend those of Simons (1996) in three ways. First, they confirm the greater sensitivity of object than of spatial working memory performance to manipulations of verbal processing under conditions that rule out the confounding factor of working memory task difficulty. In both experiments, the relative effects of verbal (vs. motion) distraction on spatial n-back performance decreased as difficulty increased. Our results are, therefore, consistent with the hypothesis that a verbal code makes an important contribution to the retention of object identity information in working memory. Second, our results specify that within the category of verbal processing, a semantic code contributes to object working memory. Third, they establish that the dependence of object working memory on a semantic code can be localized to maintenance-related processes. (We assume that this semantic code is initially associated with the stimulus representation at the time of encoding [Potter, 1993], although our data do not address this possibility directly.)
The demonstration that working memory for objects depends on semantic, as well as visual, codes can be seen as an extension of the multiple encoding model of Wickens (1973). Whereas this model asserts that word stimuli are encoded not only according to their semantic and lexical attributes, but also according to a wide array of contextual information that is unrelated to their linguistic content, the present data (and those of Simons, 1996) suggest that memory for objects may necessarily incorporate a semantic component, regardless of the seeming abstractness of those objects, or of the demands of the task. If semantic coding is indeed an obligatory aspect of object working memory, one implication is that models of working memory that posit that different domains of information are processed by different systems (e.g., an articulatory loop vs. a visuo-spatial scratch pad, as in the multiple-component model [Baddeley & Hitch, 1974; Baddeley & Logie, 1999; Logie, 1995]) may need to be modified to acknowledge the multiplicity of representational codes that can represent visual information in parallel.
The temporal localization of the verbal distraction effect to the delay period of the working memory task suggests at least two important directions for future inquiry. First, are nonmaintenance-related working memory processes (i.e., those associated with encoding and response) also dependent on a verbal code? And second, which of the many theoretically dissociable maintenance-related processes (e.g., rehearsal, storage, and the control processes that govern them [Kieras, Meyer, Mueller, & Seymour, 1999], redintegration [Schweickert, Guentert, & Hersberger, 1990], maintenance of ordinal position [Henson, 1999], attention shifting [Garavan, 1998], central executive function [Baddeley, 1986; Cowan et al., 1998]) are sensitive to verbal distraction? The answers to these questions may have implications that extend beyond working memory and that also inform our understanding of the visual recognition and representation of objects.
The results presented here are consistent with the hypothesis that spatial and object working memory differ fundamentally in that only object working memory depends on verbal mediation (Simons, 1996). We believe that our results are best understood from the perspective that the representation of knowledge about objects overlaps considerably (computationally and neurally) with the mechanisms responsible for sensory perception (Chao, Haxby, & Martin, 1999; Farah & McClelland, 1991; Thompson-Schill, Aguirre, D’Esposito, & Farah, 1999). From this perspective, our results arise necessarily from the fact that visual perception of objects automatically recruits semantic knowledge about related objects.
This research was supported by Grants AG 06605, NS 01762, AG 13483, MH 064498, and a Vilas Young Investigator award (University of Wisconsin–Madison). We thank Mary Potter and Sharon Thompson-Schill for helpful discussions of this research, Mark Snow for programming assistance, Joseph Locascio for statistical consulting, and Wendi Benalt, Linda Kim, Matias Klein, Bridget Lam, Jessica Lease, Hwamee Oh, and Mark Silverman for data collection assistance.
1. The view that we are advancing about the short-term retention of information does not require us to endorse theoretical models that either do or do not distinguish between short-term memory and working memory (e.g., Miyake & Shah, 1999). For simplicity, and because the task used in the studies described here is unambiguously a working memory task, we will refer to the short-term retention of information as working memory throughout this article.
2. Our methods did not permit us to distinguish among the multiple mechanisms that support the maintenance of information in working memory (e.g., storage and rehearsal; Kieras et al., 1999), nor to dissociate strictly maintenance-related processes from the executive control-related functions also implicated in n-back performance (e.g., shifting attention and updating positional codes).
BRADLEY R. POSTLE, University of Wisconsin, Madison, Wisconsin.
MARK D’ESPOSITO, University of California, Berkeley, California.
SUZANNE CORKIN, Massachusetts Institute of Technology, Cambridge, Massachusetts.