Short-term memory (STM), or the ability to hold information in mind for a few seconds, is thought to be limited in its capacity to about 7 ± 2 items. Notably, the average STM capacity when using American Sign Language (ASL) rather than English is only 5 ± 1 items. Here we show that, contrary to previous interpretations, this difference cannot be attributed to phonological factors, item duration or reduced memory abilities in deaf people. We also show that, despite this difference in STM span, hearing speakers and deaf ASL users have comparable working memory resources during language use, indicating similar abilities to maintain and manipulate linguistic information. The shorter STM span in ASL users therefore confirms the view that the spoken span of 7 ± 2 is an exception, probably owing to the reliance of speakers on auditory-based rather than visually based representations in linguistic STM, and calls for adjustments in the norms used with deaf individuals.
Working memory refers to the capacity-limited ability to maintain and manipulate information relevant to an ongoing task. Over the years, a large number of studies have focused on the part of working memory dedicated to short-term maintenance of information, which is known as STM. As reports have documented a significant relationship between the size of the STM capacity for linguistic material and language abilities (for review, see refs. 1,2), much work has been conducted to uncover the mechanisms underlying the STM capacity limit. One of the most common measures of capacity limits in STM is the digit span task3, where subjects must repeat lists of digits in the same order as they are presented (i.e., forward serial recall). The number of digits to be recalled is progressively increased, and the STM span is defined as the longest sequence reported correctly. As noted in a seminal study in 1956 by Miller4, our ability to process information in such short-term memory tasks has a capacity limit of seven plus or minus two items. The ‘magical number’ of 7 ± 2 has been widely confirmed as the capacity limit in STM since this early work.
However, the view that 7 ± 2 is the standard capacity of STM has been recently questioned. When non-nameable materials are used, the span of STM drops to four or five items5 (for review, see ref. 6). It has been proposed that a STM span of 7 ± 2 is the exception rather than the rule. One hypothesis is that the exceptionally high STM span of 7 ± 2 is specific to linguistic material and derives from the ability of humans to chunk linguistic information6. An alternative possibility, however, is that the exceptionally high STM span of 7 ± 2 might be an effect of modality, arising from greater STM capacity for encoding serial information in auditory STM as compared to visual STM. In the present studies, we examined STM span in native users of ASL, which offers a unique opportunity to separate the contributions of language versus modality to STM capacity.
ASL, the natural gestural language used by deaf people in the United States and parts of Canada, has all the linguistic properties of other natural languages7,8. ASL possesses a ‘phonology’, morphology and syntax that are as complex as those present in spoken natural languages7,8. Phonology in ASL refers to the fact that signs are composed of independent visual-gestural features of hand shape and palm orientation, location in space and motion (analogous to features like voicing, manner and place of articulation in spoken languages). Importantly, the perception and encoding of signs in STM rely on these phonological features, as is the case for spoken words. In the case of speech, STM mechanisms have been best described by the phonological loop model of Baddeley2,9. In this model, spoken items are encoded in STM based on their phonological properties (i.e., as they sound). Accordingly, spoken serial recall is more limited for words that sound alike, as their encoded traces are similar and thus confusable, an effect termed the phonological similarity effect. Once encoded, traces are assumed to decay steadily unless rehearsed through a subarticulatory mechanism. The idea of decay and rehearsal mechanisms in spoken STM is supported by a reduced span when the words to be recalled take longer to produce (the word length effect) as well as when concurrent articulation is required (the articulatory suppression effect). For detailed reviews on the phonological loop model, see refs. 2,9. Recent evidence also shows that phonological complexity, or the complexity of the articulatory plan necessary to pronounce the sounds forming the target words, affects the length of the span10–12 (for review, see ref. 13).
The few available studies of memory in adult native users of ASL indicate that similar mechanisms are at play in ASL STM and spoken STM. Indeed, serial recall in signers is affected by signed phonological similarity14–16, sign length17 and manual articulatory suppression16,18,19. These results indicate that, like speakers, native ASL signers rely on phonological encoding in signed STM, and use a subarticulatory (manual) mechanism to rehearse signs in STM (for review, see ref. 20). However, in contrast to English, previous studies of ASL span report a signed span of only 4–5 signs21–24. The shorter signed STM serial span for ASL has often been attributed to the longer item duration for signs than for speech7,19–22,24. This explanation is based on evidence showing that, across spoken languages, the slower the pronunciation rate (and thus the longer the item duration), the shorter the spoken STM serial span.
The goal of the present studies was to determine the capacity of STM in native ASL signers while controlling for the phonological and pronunciation factors known to affect STM span measures in spoken languages. As reviewed above, both phonological properties (similarity and complexity) and articulation duration of the target linguistic information determine the capacity limit of spoken STM (for review, see ref. 25). Given that similar mechanisms seem to underlie serial recall in both spoken and signed STM, at least three factors could explain the shorter signed STM span previously reported21–24. First, the difference might be due to a greater phonological complexity in signs. Second, the shorter signed STM span might also be due to greater phonological similarity in the particular signs used in previous studies. Third, given evidence showing that signs require longer to articulate than English words26, the shorter signed STM span might be due to longer sign duration. Under the hypothesis that the spoken English STM span of 7 ± 2 items is due to an advantage for linguistic information6, one would expect to see an increased STM span in ASL, once signed phonological properties and sign duration are controlled for.
In experiment 1, we investigated whether controlling for phonological properties (complexity and similarity) as well as for articulation duration would close the gap between the size of the signed STM span in deaf signers and that of the spoken STM span in hearing speakers. As in previous research, the English materials consisted of lists of digits from 1 to 9, as these digits are phonologically dissimilar and of very low phonological complexity. To match these properties in ASL, we used a set of ASL finger-spelled letters that, like digits, are phonologically simple and highly familiar to signers. Furthermore, and unlike signed digits, a subset of ASL letters that have little phonological similarity can be easily selected.
Although much debate exists regarding measures of articulation duration and rehearsal rate in STM (for review, see ref. 25), speeded reading rate has been most commonly used as a measure of rehearsal rate in STM24,27,28. To control for item duration and rehearsal rate across speech and sign, we asked deaf native ASL signers and hearing English controls to read a list of 200 items (digits read aloud for speakers, letters signed for signers) at the fastest pace they could while still articulating all of the items clearly. There was no significant difference between the mean speeded reading rate for speakers and that for signers (2.9 items/s for both groups, F1,22 < 1, ω2 = 0.00), indicating that similar articulation duration was present for ASL letters in signers and English digits in speakers.
The same participants were tested on the STM span task. Native signers viewed a videotape of a native signer producing short sequences of letters at a fast and natural ASL presentation rate. Native speakers were presented with a videotape of a native English speaker producing digit sequences at a rate of presentation similar to that in the ASL videotapes. Results showed that the span of native signers varied between 3 and 6, with a mean of 4.4 (standard error, s.e.m. = 0.26) items. In contrast, and as expected, the span of native speakers varied between 4 and 9, with a mean of 7.2 (s.e.m. = 0.46) items. A one-way analysis of variance (ANOVA) comparing performance in the two groups confirmed that these spans are significantly different from each other (F1,22 = 27.29, P < 0.001, ω2 = 0.52). Thus, even though the materials used in each language were phonologically simple and phonologically dissimilar, and led to similar articulation duration, deaf signers still showed a significantly shorter STM span than hearing speakers (Fig. 1).
Experiment 1, as well as earlier studies aiming to assess capacity limits in signed STM, included only deaf signers. It is therefore possible that the shorter signed STM span previously reported reflects reduced memory abilities in deaf individuals. Experiment 2 controlled for this possibility. A group of 20 adult, deaf, native signers and a group of 20 adult, hearing, native ASL/English bilinguals were tested in the ASL span task. In addition, to verify that native speakers would present the expected English span of 7 ± 2 items, the 20 ASL/English bilinguals, as well as 20 hearing monolingual controls, were also tested on the standard English digit span task3.
Materials similar to those in experiment 1 were used, but as the relatively fast rate of presentation used in experiment 1 is not the standard rate in the STM literature, new lists were videotaped using the standard STM rate of presentation of 1 item/s3. As in experiment 1, the stimuli used in experiment 2 were phonologically simple and dissimilar for both signed and spoken stimuli. Item duration was controlled for by measuring recall rate or the number of items enunciated per second during the recall phase of the STM task. Although this method is likely to underestimate the articulation time used during rehearsal, it has the advantage over the speeded reading measure used in experiment 1 of measuring articulatory duration while participants are actually performing the short-term memory task25. The recall rate was significantly faster for ASL deaf signers (mean ± s.e.m. = 3.52 ± 0.24 items/s) than for English speakers (2.56 ± 0.15 items/s; F1,38 = 11.56; P < 0.002, ω2 = 0.21). Similarly, among bilinguals, the recall rate tended to be faster for signs (2.9 ± 0.18 items/s) than for English digits (2.55 ± 0.1 items/s; F1,19 = 3.47, P < 0.08, ω2 = 0.11), establishing that articulation duration for ASL letters in signers is similar to that for English digits in speakers, if not faster.
Despite this fact, the signed STM span in all ASL conditions was significantly smaller than the spoken STM span measured by the English digit span. This was the case not only between deaf signers and the hearing controls (deaf = 4.85; hearing = 6.4; F1,38 = 14.4; P < 0.001; ω2 = 0.25), but also within the hearing native ASL/English bilinguals tested in ASL versus English (ASL = 5.2; English = 7.05; F1,19 = 37.6; P < 0.001; ω2 = 0.57). Thus, for the same individual, we observed a span of about seven items when tested in English, but a span of about five items when tested in ASL.
This result establishes that the shorter ASL span cannot be attributed to reduced memory capacity in deaf signers. Rather, it is the use of a sign language that underlies the difference in span noted earlier between hearing speakers and deaf signers. The ASL STM span was, in all cases, significantly smaller than the spoken STM serial span of 7 ± 2 observed in English speakers. This difference in span was observed despite the use of phonologically dissimilar and simple signs and, if anything, a faster recall rate in ASL than in English (Fig. 2), ruling out an interpretation of the language difference in terms of slower articulation during rehearsal in ASL.
Experiments 1 and 2 show that in the case of a serial STM span task, spoken items lead to a longer span than signed items. In experiment 3, we asked whether the sizeable language effect observed with an STM serial span task would also be observed when a working memory span task was used. Importantly, whereas STM span tasks require active maintenance of items in a specific serial order, working memory span tasks require on-line manipulation of linguistic information rather than maintenance of serial order. Measures of working memory have been proposed to be better predictors of language skills than measures of STM serial recall29–31. Indeed, successful linguistic processing (as, for instance, in language comprehension and production) critically depends on on-line manipulation of the relevant linguistic information. Thus, although STM span tasks are routinely used for assessment in clinical and educational settings, spoken and signed STM span tasks might not be optimal measures of capacity limits in linguistic working memory, where the conjunction of active maintenance and on-line manipulation of linguistic information is needed. The outcome of experiment 3 has obvious practical implications for the deaf community, and also provides a test ground for the impact of serial order information on short-term memory processes across modalities.
As in our earlier experiments, deaf native ASL signers were tested using ASL for stimulus presentation and recall, and their performance was compared to that of a control group of native English speakers tested in English. In both groups, 18 participants performed first an STM span task and then a working memory task. The latter task was inspired by the speaking span task, which was designed to assess working memory resources in language production32. On each trial, participants were presented with a list of words and asked to recall each of the presented words in a separate, self-generated sentence. For example, given the list “voice, airplane,” a correct response would be “The boy does not use his voice; The airplane arrived late.” Importantly, recall of the order of target words is not required. Hence in our example, the response “He saw an airplane in the sky; She has a pretty voice” would also be correct. For native English speakers, the working memory span measured through the speaking span task is about 3 ± 1 items (mean = 3.15) (ref. 32). Importantly, similar speaking spans have been reported for native speakers of different spoken languages32,33. In contrast, STM spans vary with phonological complexity and word length, and thus differ across languages. Native speakers of languages in which digit names are shorter to enunciate, such as Chinese, tend to have longer STM spans34, whereas speakers of languages with longer digit names, such as Welsh, show shorter STM spans27. Thus, unlike STM span measures, the working memory span measure we propose to use seems to provide a cross-linguistically stable assessment of capacity limits in working memory, independent of the idiosyncratic properties of different languages.
We therefore constructed an ASL signing span and an English speaking span to investigate the impact of language modality on working memory resources. If the use of sign language leads to poor short-term memory for linguistic tasks, the same population difference should be observed with the working memory task as with the STM span task. Alternatively, the structural and functional similarities of working memory across natural languages predict that a similar working memory capacity limit of about 3 ± 1 items should be observed for both speakers and signers tested in their respective native languages.
As in the previous experiments, a sizeable difference between deaf signers and hearing speakers was observed on the STM span task (deaf, mean ± s.e.m. = 5.5 ± 0.2; hearing, 8.06 ± 0.17; F1,34 = 93; P < 0.0001; ω2 = 0.72; Fig. 3a). In contrast, the ASL and English working memory spans did not differ significantly from one another (deaf, mean ± s.e.m. = 2.94 ± 0.1; hearing, 3.22 ± 0.18; F1,34 = 1.81; P > 0.18; ω2 = 0.02; Fig. 3b). Thus, the sizeable discrepancy observed in signers’ versus speakers’ memory capacities as revealed by an STM span task was not found when a working memory test was used. Native ASL signers and native English speakers show quite similar working memory resources for the maintenance and on-line manipulation of linguistic information in language production, despite a lower capacity in signers to maintain signed information in serial order.
The ASL STM span of native signers, deaf or hearing, was never close to the ‘magical number 7 ± 2’ consistently observed as the STM span for spoken information. This was the case despite the extreme phonological simplicity of the ASL finger-spelled letters used. In addition, the shorter ASL span was observed even when pronunciation rate was equivalent or even faster in ASL than in English. Thus, our findings indicate that the shorter ASL STM span cannot be explained by the phonological properties of signs, by the presentation or recall duration of signs as compared to speech, or by reduced memory capacities in deaf individuals.
Most importantly, in contrast with the hypothesis that exceptionally high STM spans derive from an advantage of storage of linguistic (over nonlinguistic) information in STM6, the present results show that the number 7 ± 2 is specific to a serial STM span task in which the information is encoded in an auditory representation. The significant differences between spoken and signed STM spans reported here thus indicate that the exceptionally high linguistic STM span in speakers might be due to a modality effect, rather than a linguistic effect, in STM. We suggest two possible mechanisms by which the storage of spoken information might differ from that of signed information in the context of serial recall from STM. First, in the phonological loop model, the phonological store is assumed to build upon earlier sensory memory stores. Information encoded in these stores is known to decay over time, and it is possible that speech-like information decays at a slower rate than visually encoded information. Certainly, at the level of the primary sensory stores, echoic memory lasts for 2–4 s (ref. 35), whereas iconic memory, on which ASL encoding is likely to depend, only lasts at most 1 s (ref. 36). As a result, the time over which an item could be maintained without rehearsal might be much longer for words that have been encoded into speech than for signed items. This difference might, at least in part, explain the exceptionally high STM span for linguistic items in speakers as compared with all other (visual) materials tested in the literature (for review, see ref. 6).
A second possibility is that the longer STM span in speakers might be due to differences in the retention of serial order information across modalities37,38. Assessments of capacity limits in STM are traditionally conducted using forward serial recall tasks (such as the digit span task), in which correct recall requires producing items in the order of presentation. The requirement of serial order recall is likely to benefit materials that are encoded in an auditory as opposed to a visual format. The auditory system is known to be highly efficient in retaining the order of occurrence of sounds39. In contrast, the visual system seems to be more limited in its ability to retain temporal order information, but is much more efficient in retaining other types of information, such as spatial structure. Building on a similar argument, others have proposed that signers and speakers may encode order information in quite different manners; speakers rely predominantly on temporal encoding and signers predominantly on spatial encoding13. Thus the difference in STM span between speakers and signers might arise, at least in part, from the STM span task requirement of recalling items in serial order, combined with the use of stimuli with a clear temporal pattern but little, if any, spatial patterning. The proposal that serial order recall drives the STM span difference across languages is further supported by reports that speakers and signers show similar STM performance in tasks that do not require ordered recall, such as the signing span task presented in experiment 3 or free recall tasks of linguistic information40,41.
The finding that, despite signers’ shorter STM span, similar working memory resources are present in native signers and speakers indicates similar abilities in tasks for which a conjunction of active maintenance and on-line manipulation of linguistic information is required. Thus, the shorter STM span in deaf and hearing native signers does not have a direct influence on working memory, or more generally on the language skills of signers. This pattern of results highlights the importance of using cross-linguistically stable measures when comparing memory capacity limits in native users of different languages. In that respect, working memory measures are more advantageous than standard STM span measures30. Unfortunately, STM span tasks are currently the most commonly used measures of working memory for applied purposes, such as clinical evaluation and educational testing. Our findings suggest that there should be adjustments in the norms applied in evaluative procedures where STM measures requiring serial recall are used. For instance, such norm adjustments are currently needed in the linguistic assessment of deaf patients in clinical settings, as well as in standardized evaluations, such as testing using the Wechsler Adult Intelligence Scale (WAIS)3,42. More generally, it is important to recognize that short-term and working memory may operate in somewhat different ways across encoding modalities, and may therefore support equally useful linguistic and nonlinguistic processing in different fashions across the auditory and visual modes.
Fifty congenitally deaf, native ASL signers were recruited from the Rochester, New York area and from Gallaudet University (Washington, DC). All deaf signers were exposed to ASL from birth by their deaf parents, considered ASL as their primary language and used ASL daily (see Table 1 and Supplementary Methods online for details). Fifty hearing native English speakers (unfamiliar with ASL) were recruited from the Rochester, New York area. Twenty hearing native ASL/English bilinguals were recruited among hearing children of deaf adults, with the constraint that they had never been trained in interpreting (Table 1). Written informed consent was obtained from all participants.
Stimuli were displayed on a Macintosh PowerBook G3 (monitor size, 14 inches), using PsyScope software43. The Psychophysics Toolbox in Matlab44,45 (The MathWorks) was used to present the STM stimuli in experiment 3. Participants’ performances were videotaped using a Sony TVR-900 DV camera. All stimuli were videotaped and presented as short movies.
Nine ASL letters, chosen to maximize phonological dissimilarity (B, C, D, F, G, K, L, N and S), were used to create 16 meaningless sequences (2–9 items, 2 sequences of each length). To create the ASL stimuli, we videotaped a deaf native ASL signer (TS) while he finger-spelled each sequence using a natural ASL smooth prosody (i.e., with minimal transitions and yet no coarticulation between letters; mean rate = 3.6 items/s; s.e.m. = 0.13).
A native English speaker was videotaped while producing the digit sequences of the WAIS digit span3 at the same rate as used for ASL (mean ± s.e.m. = 3.2 ± 0.05 items/s). The English presentation rate did not differ from the ASL rate reported above (F1,30 = 3.22, P > 0.05).
Participants viewed a movie instructing them to recall the sequences in the same order as presented (in ASL for signers and in English for speakers). Signers were given two practice trials, and were instructed to indicate forgotten items by signing “BLANK” in place of the forgotten item. Hearing participants received the instructions of the WAIS digit span3.
All participants were first exposed to two trials, each of which consisted of a sequence of two items. Then, the sequence length was progressively increased. Testing ended when the participant produced inaccurate recalls for both trials at a given sequence length. Thus, as in the WAIS digit span3, the ASL and English spans were defined as the longest list length at which a correct serial recall was observed.
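The span scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the scoring script used in the study: the function name `stm_span`, the dictionary layout and the example sequences are all assumptions made for this sketch. A trial counts as correct only if the recalled sequence matches the presented sequence item by item, and scoring stops at the first length at which every trial is incorrect.

```python
from typing import Dict, List, Sequence, Tuple

# One trial = (presented sequence, recalled sequence). Hypothetical layout.
Trial = Tuple[Sequence[str], Sequence[str]]


def stm_span(trials_by_length: Dict[int, List[Trial]]) -> int:
    """Return the longest list length with at least one correct serial recall.

    Scoring stops at the first length at which all trials are incorrect,
    mirroring the stopping rule of the testing procedure. Returns 0 if no
    sequence was recalled correctly (an assumption; the text does not
    describe this case).
    """
    span = 0
    for length in sorted(trials_by_length):
        # Serial recall: same items in the same positions.
        correct = [list(presented) == list(recalled)
                   for presented, recalled in trials_by_length[length]]
        if not any(correct):
            break  # both trials failed at this length: testing ends
        span = length
    return span


# Made-up responses for one signer (letters drawn from the ASL letter set).
example = {
    2: [(["B", "K"], ["B", "K"]), (["S", "L"], ["S", "L"])],
    3: [(["B", "K", "N"], ["B", "N", "K"]), (["D", "F", "G"], ["D", "F", "G"])],
    4: [(["B", "K", "N", "S"], ["B", "K", "S", "N"]),
        (["C", "D", "F", "G"], ["C", "D", "BLANK", "G"])],
}
print(stm_span(example))  # prints 3
```

A forgotten item reported as “BLANK” simply fails the position-by-position comparison, so no special handling is needed for it.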
The ASL letter set described above in the STM span task was used to create a page with 20 lines, each containing 10 letters in random order. For English stimuli, we constructed a page with 20 lines, each containing 10 digits in random order.
Participants were instructed to read the list aloud (in English for speakers and in ASL for signers) as fast as possible, but clearly. The mean articulation rate was computed for each participant.
ASL stimuli were created as in experiment 1, at an average presentation rate of 1.2 items/s (s.e.m. = 0.01). For English stimuli, the WAIS digit sequences3 were enunciated by a native English speaker at an average rate of 1.1 items/s (s.e.m. = 0.04).
The procedure was the same as in experiment 1. Each deaf and hearing signer was tested on the ASL STM span task. The English digit span was administered to the ASL/English native bilinguals and to the hearing speakers. For each participant, the mean recall rate was computed based on their correct recall performances during the STM task.
The ASL digits (1 through 9) were used to create 16 sequences (2–9 items, 2 sequences per length). A native signer was videotaped signing each sequence at a rate of 1 item/s. Each ASL sequence described above was spoken by a native English speaker at a rate of 1 item/s and videotaped.
Eighty-one one-handed ASL noun signs were selected based on their frequency of use and their phonological complexity (see Supplementary Methods for details). Eighteen sequences were created (2–7 noun signs, 3 sequences per length). Each sequence was signed by a native ASL signer at a rate of 1 item/s and videotaped. For the spoken English working memory task, the 18 ASL sequences were translated into English word sequences. As in the original speaking span, the English sequences were presented in writing on a computer screen (1 noun/s; interstimulus interval = 10 ms)32.
Participants were all tested in the speeded reading task first (no difference between deaf signers, 3.31 items/s, and hearing speakers, 3.29 items/s), the STM span task second, and the working memory task third. In the working memory task, participants were instructed to (i) remember each presented noun and (ii) recall each noun in a separate self-generated sentence. Deaf signers received the instructions in ASL through a movie; speakers received written instructions. Each participant completed three trials that contained sequences of two items, then three trials with sequences of three items, and so on, until all sequences were presented.
The working memory spans were computed starting with the recall of two-item sequences (working memory span = 2). For each sequence length with correct recall of all items in at least two out of three trials, one point was added to the working memory span score. The scoring procedure was terminated at the first sequence length at which fewer than two trials were correct. When a subject gave an accurate response in one of the three trials at the last list length considered, 0.5 was added to the final span score. Correct recall required that (i) the target noun sign was recalled in a sentence and (ii) the sentence was syntactically and semantically well formed. The target noun sign was not always exactly recalled as presented. Recall of a word with a similar surface but a different syntactic role was scored as correct (e.g., ‘dangerous’ instead of ‘danger’ in English or ‘fly’ instead of ‘airplane’ in ASL).
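One way to read the working memory scoring rule is sketched below in Python. It is an interpretation under stated assumptions, not the authors' scoring code: the function name `working_memory_span` and the input layout are hypothetical, sequence lengths are assumed to be consecutive, and the span is taken to equal the longest length passed (so passing only the two-item level gives a span of 2). The text does not say how a failure at the two-item level is scored; the sketch assumes a base of 1 in that case.

```python
from typing import Dict, List


def working_memory_span(correct_by_length: Dict[int, List[bool]]) -> float:
    """Score a speaking/signing span from per-trial outcomes.

    correct_by_length maps each sequence length to three booleans, one per
    trial, True when every target noun was recalled in a well-formed
    sentence (order of recall is irrelevant).
    """
    lengths = sorted(correct_by_length)
    # ASSUMPTION: failing the shortest length yields a base of length - 1.
    span = float(lengths[0] - 1)
    for length in lengths:
        n_correct = sum(correct_by_length[length])
        if n_correct >= 2:
            span = float(length)  # level passed: at least 2 of 3 trials
        else:
            if n_correct == 1:
                span += 0.5       # partial credit at the terminating level
            break                 # scoring stops at the first failed level
    return span


# Made-up outcomes: passes lengths 2 and 3, one correct trial at length 4.
outcomes = {
    2: [True, True, True],
    3: [True, False, True],
    4: [False, True, False],
    5: [True, True, True],  # ignored: scoring already terminated
}
print(working_memory_span(outcomes))  # prints 3.5
```

Under this reading, the group means of about 3 reported above correspond to participants typically passing the three-item level and failing the four-item level.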
Analyses of variance (ANOVAs) were performed for all reported comparisons. The effect size (ω2) is reported for all comparisons between ASL and English. Each effect size was computed by ω2 = (SSeffect − (k − 1) MSresidual)/(SStotal + MSresidual), where k is the number of levels of the effect. Whereas the significance of F values is affected by small sample sizes (and hence by low power), ω2 is unaffected by variations in sample sizes46. Thus, ω2 provides a reliable estimate of effect size in the present experiments, where small sample sizes were due to the small size of the population of native ASL signers.
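The effect-size formula above can be illustrated with a short Python sketch. The function names and the sample data are assumptions made for this example. For a between-subjects one-way design, substituting SSeffect = (k − 1) × F × MSresidual and SStotal = SSeffect + SSresidual into the formula gives the equivalent form ω2 = (k − 1)(F − 1)/((k − 1)(F − 1) + N), where N is the total number of participants; this shortcut is not claimed to hold for the within-subject (bilingual) comparisons.

```python
from typing import List, Sequence, Tuple


def omega_squared(ss_effect: float, ss_total: float,
                  ms_residual: float, k: int) -> float:
    """Omega squared from sums of squares; k = number of levels."""
    return (ss_effect - (k - 1) * ms_residual) / (ss_total + ms_residual)


def omega_squared_from_f(f_value: float, df_effect: int, n_total: int) -> float:
    """Same quantity from F, for a between-subjects one-way design only."""
    numerator = df_effect * (f_value - 1.0)
    return numerator / (numerator + n_total)


def one_way_anova(groups: Sequence[List[float]]) -> Tuple[float, float, float, float]:
    """Return (SS_effect, SS_total, MS_residual, F) for independent groups."""
    values = [x for g in groups for x in g]
    n_total = len(values)
    grand_mean = sum(values) / n_total
    ss_effect = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_total = sum((x - grand_mean) ** 2 for x in values)
    ms_residual = (ss_total - ss_effect) / (n_total - len(groups))
    f_value = (ss_effect / (len(groups) - 1)) / ms_residual
    return ss_effect, ss_total, ms_residual, f_value


# Made-up span scores for two small groups (not data from the study).
signers = [4.0, 5.0, 4.0, 3.0, 6.0]
speakers = [7.0, 8.0, 6.0, 9.0, 7.0]
ss_e, ss_t, ms_r, f = one_way_anova([signers, speakers])
print(omega_squared(ss_e, ss_t, ms_r, k=2))
print(omega_squared_from_f(f, df_effect=1, n_total=10))  # same value
```

As a consistency check, `omega_squared_from_f(27.29, 1, 24)` gives approximately 0.52, in line with the value reported for experiment 1 (F1,22 = 27.29).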
We wish to thank the students and the staff of the National Technical Institute for the Deaf, Rochester, New York, and of Gallaudet University, Washington, DC, as well as the participants and organizing committee of CoDAWay 2002. We are also grateful to P. Clark and P. Hauser for their support and to M. Hall, O. Pouliot, J. Cohen, D. Metlay and R. Harris for their assistance. This research was supported by the National Institutes of Health (DC04418 to D.B.; DC00167 to E.L.N. and T.S.) and by the James S. McDonnell Foundation (D.B.).
COMPETING INTERESTS STATEMENT
The authors declare that they have no competing financial interests.
Note: Supplementary information is available on the Nature Neuroscience website.