This work focuses on the twenty-six individuals who provided data to AphasiaBank on at least two occasions, with initial testing between 6 months and 5.8 years post-onset of aphasia. The data are archival in nature and were collected from the extensive database of aphasic discourse in AphasiaBank.
The aim is to furnish data on the nature of long-term changes in both the impairment of aphasia as measured by the Western Aphasia Battery-Revised (WAB-R) and its expression in spoken discourse.
AphasiaBank’s demographic database was searched to discover all individuals who were tested twice at an interval of at least a year with either: 1) the AphasiaBank protocol; or 2) the AphasiaBank protocol at first testing, and the Famous People Protocol (FPP) at second testing. The Famous People Protocol is a measure developed to assess the communication strategies of individuals whose spoken language limitations preclude full participation in the AphasiaBank protocol. The 26 people with aphasia (PWA) who were identified had completed formal speech therapy before being seen for AphasiaBank. However, all were participants in aphasia centers where at least three hours of planned activities were available, in most cases, twice weekly. WAB-R Aphasia Quotient scores (AQ) were examined, and in those cases where AQ scores improved, changes were assessed on a number of measures from the AphasiaBank discourse protocol.
Sixteen individuals demonstrated improved WAB-R AQ scores, defined as positive AQ change scores greater than the WAB-R AQ standard error of the mean (WAB-SEM); seven maintained their original WAB quotients, defined as AQ change scores that were not greater than the WAB-SEM; and the final three showed negative WAB-R change scores, defined as a negative WAB-R AQ change score greater than the WAB-SEM. Concurrent changes on several AphasiaBank tasks were also found, suggesting that the WAB-R improvements were noted in more natural discourse as well.
These data are surprising, since conventional wisdom suggests that spontaneous improvement in language is unlikely to occur beyond one year. Long-term improvement or maintenance of early test scores, such as that shown here, has seldom been demonstrated in the absence of formal treatment. Speculations about why these PWA improved, maintained or declined in their scores are considered.
The conventional wisdom in aphasia rehabilitation is that spontaneous recovery, that is, natural recovery of language function following an aphasia-producing stroke, is likely to be complete within one year (Culton, 1969; Demeurisse et al., 1980; Hagen, 1973; Holland, Greenhouse, Fromm, & Swindell, 1989; Prins, Snow, & Wagenaar, 1978; Shewan & Kertesz, 1984). In his authoritative texts, Davis (2006, 2013) supports this observation, as does Brookshire (2007). In fact, recent studies are beginning to define “long-term aphasia” as the residual aphasia after one year has elapsed (El Hachioui et al., 2013; Forkel et al., 2014). Comparatively few studies have looked at changes over longer periods of time (Fitzpatrick, Glosser, & Helm-Estabrooks, 1988; Naeser, Gaddie, Palumbo, & Stiassny-Eder, 1990; Naeser et al., 1998; Wade, Hewer, David, & Enderby, 1986).
With the exception of the extensive Copenhagen study (Pedersen, Vinter, & Olsen, 2004), which relied on data gathered in the 1990s, the dearth of subsequent long-term studies of aphasia without formal treatment in its many guises suggests that this aspect of recovery was essentially a closed issue before the onset of the 21st century. A comprehensive review of prognostic factors by Plowman, Hentz, and Ellis (2012) concludes that “… while patient related variables (age, gender, handedness, education, and intelligence) do not appear to significantly influence recovery patterns, stroke-related variables such as initial stroke and aphasia impairment, lesion size and lesion location, do influence recovery patterns” (p. 5).
Only a few clinical reports have documented changes in individuals who have been followed for years. Jungblut, Suchanek, and Gerhard (2009), for example, reported on a person with aphasia (PWA) who had music therapy following his aphasia for a number of years, with continued improvement. Berthier and Pulvermuller (2011) described language improvement in a chronic PWA following intensive massed-practice therapy. Holland (1999) and Holland and Ramage (2004) reported on RR, a person with aphasia who was three years post-onset when they first saw him. RR improved substantially on a number of treatment-relevant pre- and post-test measures of word retrieval over the next six years. Smania et al. (2010) reported the case of a left hemisphere stroke patient who improved in naming and repetition for three years after the stroke, with spontaneous speech emerging thereafter. Only two studies with larger sample sizes have discussed long-term changes in aphasia. Aftonomos, Steele, and Wertz (1997) reported that “the majority of 23 individuals” with chronic aphasia (11 months to 15 years post-onset) improved significantly on formal testing (13 tested with Porch Index of Communicative Ability, 10 with the Boston Naming Test, the WAB, and the Boston Diagnostic Aphasia Exam) following their involvement with the interactive technology, Lingraphica. Naeser et al. (1998) documented lesion expansion in 12 patients at five years post-stroke, but noted that these expansions had “no effect on language, and in fact, some improvement in language may occur” (p. 1). Thus, only limited research on long-term change in aphasia post-stroke is currently available.
Our notions of brain plasticity have changed over the past few decades. We have become aware that not only pre-pubescent brains, but also older human brains, including the brains of PWA, can continue to adapt and change far into the lifespan (Papathanasiou, Coppens, & Ansaldo, 2011; Raymer et al., 2008). How does that affect our current beliefs about recovery from aphasia over time? It is undeniably difficult for researchers to continue to follow most PWA for more than one year. A number of factors intervene, including simple aging and changing lifestyles, limitations of medical benefits to support aphasia intervention, fatigue with the therapeutic process and thus limited contact with service providers, and countless other personal reasons for dropping out of formal studies. However, the growing availability of aphasia centers and community programs for persons with chronic aphasia has presented an opportunity to investigate how, if at all, the impairment of aphasia might change. Community programs, if they are of benefit, should help members find ways to deal with the consequences of aphasia and should assist them in discovering how to participate more fully in life again. But the effects of these centers on improving the aphasic impairment itself have not been considered.
The NIH-funded American archive AphasiaBank (MacWhinney, Fromm, Forbes, & Holland, 2011), has found aphasia centers to be a providential source of PWA willing to provide samples of their discourse for researchers interested in studying the language of aphasia. AphasiaBank is a shared database of multimedia interactions for the study of communication in aphasia. Currently, the database includes almost 400 media files linked with transcripts for PWA and almost 200 for adults without aphasia or any other neurological impairment. These transcripts include a variety of discourse samples gathered according to a standard discourse protocol.
Because AphasiaBank researchers are interested in whether and how discourse changes over time, some PWA have provided multiple samples of their discourse over time at their respective aphasia centers or community programs. In addition to collecting discourse samples, AphasiaBank collects extensive demographic data and test results using a variety of formal and informal measures. However, the primary focus of this paper is the aphasic impairment, as measured by the Western Aphasia Battery-Revised (WAB-R) (Kertesz, 2006). Do these individuals with chronic aphasia maintain their initial test scores, decline, or improve in their aphasic impairments as measured by WAB-R? Secondarily, how, if at all, are these changes manifested in discourse?
This study focuses on 26 chronic PWA who provided speech and language samples to the AphasiaBank project. All procedures were approved by the Institutional Review Board at Carnegie Mellon University. Informed consent was obtained from all participants included in the study. Sixteen participated in the entire AphasiaBank protocol (standard discourse tasks, formal and informal tests) on more than one occasion at intervals of one year or more. The other ten participated in the AphasiaBank protocol at the first visit and were retested with the WAB-R and the FPP only (no discourse protocol) at a subsequent visit at least one year after the first visit. The group consisted of 19 males and 7 females with a mean age of 60.4 years (range = 36–90.7 years) and a mean time post-onset of 5.5 years (range = 0.5–15.3 years) at first testing. At final testing the group ranged from 1.9 to 19.35 years post-onset.
Qualitative researchers might describe this study as “serendipitous research” (cf., Roberts, 1989)1. It resulted from the attempt of the first author to find some clues in the demographic database to help to explain long-term attendance at community programs and centers by individuals who had been seen by AphasiaBank more than once, and in the interim had continued to participate in the activities of their aphasia programs.2 The 16 who repeated the entire protocol volunteered to do so when offered the opportunity to be retested. The 10 who agreed to be tested with the FPP all were encouraged to do so by clinicians at the centers because their extensive oral language difficulties severely limited their participation in the more language-oriented protocol. In all cases, the WAB-R was readministered.
The AphasiaBank protocol includes personal narratives, picture descriptions, a procedural discourse task, and a Cinderella story retelling task (MacWhinney, Fromm, Forbes, & Holland, 2011). In addition to the WAB-R AQ subtests, formal testing includes the short form of the Boston Naming Test-Second Edition (BNT) (Kaplan, Goodglass, & Weintraub, 2001) and the Verb Naming Test from the Northwestern Assessment of Verbs and Sentences-Revised (VNT) (Cho-Reyes & Thompson, 2012). For both the AphasiaBank discourse protocol and the formal tests that accompany it, strict guidelines for data gathering are used, and the guidelines for WAB-R, BNT, and VNT administration and scoring are followed. Three certified speech-language pathologists with a mean of 20 years of experience with the WAB administered these tests and, at the time of repeat testing, were unaware of the scores originally obtained. PWA came from five different sites in the United States. All of these sites have adopted the Life Participation Approach to Aphasia (LPAA) (LPAA Project Group, 2000), aiming to offer the support and training PWA need to realize their goals of participating in daily life as fully as possible. The centers offer a variety of group activities. Some teach particular skills, such as compensatory language or computer skills; some allow members to pursue their individual interests, such as gardening, art, and reading; and all encourage conversation on topics such as news and sports. Although some center members attend only once weekly, most are involved in two or three days of programming.
Both the discourse protocol and the FPP were videotaped. The protocol language samples were transcribed by experienced transcribers using the CHAT format, and all analyses were done using CLAN (MacWhinney, 2000). Following the guidelines of Berndt, Wayland, Rochon, Saffran, and Schwartz (2000), utterances were segmented based on the following hierarchy of indices: syntax, intonation, pause, semantics. Two transcribers reviewed each transcription and reached forced choice agreement on any discrepancies. Word errors were coded using the error coding system described at the AphasiaBank website (http://talkbank.org/AphasiaBank/). For statistical tests, the significance level was 0.05.
Demographic data and WAB-R data (AQ score and aphasia type) for the whole group appear in Tables 1 and 2 in the column labeled “Overall (n=26)”. The mean time post-onset for this group’s first stroke at first testing was 5.5 years, far beyond the time when spontaneous recovery should be occurring. Summary statistics for WAB-R AQ results were calculated for the whole group using two-tailed paired t-tests. Results revealed significant improvement from first to last testing for the overall WAB-R AQ as well as for the Spontaneous Speech and Repetition subtests. No statistically significant correlations were found between age and AQ change score (r = 0.–2) or time post-onset and change score (r = −0.27) for the whole group.
To better understand the nature of the improvement, the 26 participants were divided into subgroups based on changes in their WAB-R AQ scores that were greater than or less than the WAB-R AQ standard error of the mean (WAB-SEM), 2.5 points.3 For the current study, we rounded up and used +/− 3 points on the WAB-R AQ as our metric for assigning participants to the “improving”, “maintaining”, or “declining” groups. This +/− 3-point metric for the AQ is substantially higher than the 0.12 mean test-retest reliability reported in the WAB-R manual for 35 PWA with stable, chronic aphasia who were tested first at an average of 2.05 years post-onset and again at an average of 3.91 years post-onset. The clinical relevance of a +/− 3-point change in WAB-R AQ score is difficult to ascertain but can be supported by considering a number of examples. In the Spontaneous Speech section, if a person earns one more point in Information Content (by responding to one more item in the Conversational Questions or mentioning more details in the Picture Description) and scores one point higher in the Fluency rating (by improving fluency, grammatical competence and/or occurrence of paraphasias), the WAB AQ will increase by 4 points (assuming all other scores remain the same). These changes, which reflect more relevant substance and fluency in output, would suggest clinical improvement. Bigger performance changes are necessary in the other subtest areas to reach a 3-point change in WAB-R AQ. For example, the Repetition or Naming subtest scores would need to increase by 15 points, the Auditory Comprehension subtest score would have to increase by 30 points, or some combination of the above net increases would need to occur. These changes are not minor and would also appear to represent clinically relevant changes comfortably outside the test-retest measure.
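The arithmetic behind these examples can be sketched in a few lines of Python. The sketch assumes the standard WAB-R AQ weighting, which is not spelled out in the text: Spontaneous Speech scored out of 20, Auditory Comprehension out of 200 (divided by 20), Repetition and Naming out of 100 (each divided by 10), with the sum doubled. The function names, the baseline scores, and the treatment of a change of exactly 3 points as crossing the threshold are illustrative assumptions, not part of the AphasiaBank procedures. The SEM inputs are the values reported in footnote 3.

```python
import math


def aphasia_quotient(spontaneous_speech, auditory_comprehension, repetition, naming):
    """AQ under the assumed weighting: 2 * (SS + AC/20 + Rep/10 + Naming/10)."""
    return 2 * (
        spontaneous_speech
        + auditory_comprehension / 20
        + repetition / 10
        + naming / 10
    )


def standard_error_of_mean(sd, n):
    """SEM = SD / sqrt(n)."""
    return sd / math.sqrt(n)


def classify_change(aq_change, threshold=3.0):
    """Label an AQ change score; counting exactly +/-3 as a change is an assumption."""
    if aq_change >= threshold:
        return "improving"
    if aq_change <= -threshold:
        return "declining"
    return "maintaining"


# Values from footnote 3: SD = 29.9, n = 141.
sem = standard_error_of_mean(29.9, 141)
print(round(sem, 2))  # 2.52, rounded up to 3 points in the study

# Hypothetical baseline subtest scores, for illustration only.
baseline = dict(spontaneous_speech=10, auditory_comprehension=120,
                repetition=50, naming=50)
base_aq = aphasia_quotient(**baseline)


def aq_gain(**changes):
    """AQ change when the named subtests move by the given number of points."""
    updated = {name: score + changes.get(name, 0) for name, score in baseline.items()}
    return aphasia_quotient(**updated) - base_aq


print(aq_gain(spontaneous_speech=2))       # 4.0
print(aq_gain(repetition=15))              # 3.0
print(aq_gain(naming=15))                  # 3.0
print(aq_gain(auditory_comprehension=30))  # 3.0
print(classify_change(aq_gain(spontaneous_speech=2)))  # improving
```

With this hypothetical baseline, a 2-point gain in Spontaneous Speech moves the AQ by 4 points, whereas Repetition or Naming must rise by 15 points, or Auditory Comprehension by 30, to move it by 3.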
Figure 1 is a graphic display showing that almost two thirds of the PWA group improved and almost nine tenths of the group either improved or maintained their language abilities on the WAB-R subtests. Figure 2 shows each individual’s WAB-R AQ change score. Visually, across the distribution of change scores, clear demarcations between the groups are apparent, with relatively moderate change scores in the decliners (in red) and reasonably robust change scores in many of the improvers (in green). Results of a Mann-Whitney U-test confirmed that the WAB-R AQ change scores for the improving and maintaining groups were significantly different (p = 0.0001) from each other.
As indicated previously, Tables 1 and 2 provide demographic data and WAB-R (AQ scores and aphasia types) for the three subgroups. It is not apparent in Table 1, but of the three PWA whose WAB-R AQ scores deteriorated, two were above the mean initial age of both the improvers group and the maintainers group. The improvers had lower initial WAB-R AQ scores than the other two subgroups, while their WAB-R AQ scores approximated those of the maintainers on final testing.
Overall WAB-R AQ scores changed significantly for the improvers from first to last testing. Statistical analyses for the improvers were conducted with one-tailed tests because it was already evident that these individuals demonstrated an overall improvement; thus, the goal of the other statistical tests was to determine which of the WAB-R subtests were most clearly responsible for the improvement. To identify the source of improvement, the WAB-R AQ’s language domain subtest scores from first and last testing were compared for the 16 improvers. Table 3a shows their scores and paired t-test results. All subtest scores improved significantly. The largest changes were seen in the Spontaneous Speech subtest score, reflecting increased information content and improved fluency. Changes on this subtest, arguably the WAB-R’s most qualitative, indicate observable changes in the accuracy and fluency of oral question-answering and picture description. Subtest scores from first and last testing for the other groups were also examined and are summarized in Tables 3b and 3c. For the seven maintainers, whose WAB-R AQ scores were essentially unchanged, no significant differences were found for any of the four subtest summary scores. For the three participants in the group whose scores declined, no statistical comparison was possible. The mean score for Repetition increased from 63.3 at first testing to 68 at last testing, with each of the three participants making slight improvements on that subtest. The improvement in Repetition was offset by decreases for all three participants in Spontaneous Speech, Auditory Comprehension, and Naming.
Although all 16 individuals who improved on the WAB-R were tested and retested with the WAB-R, only 11 of them received the full AphasiaBank protocol on more than one occasion: one did not complete the discourse tasks at first testing and the remaining four individuals received the FPP at their next AphasiaBank visit instead of the discourse tasks. In this secondary analysis, the language transcripts from the discourse tasks and the test results for these 11 PWA were examined to determine whether improved scores on the WAB-R were reflected on other AphasiaBank protocol tasks. (Comparisons for those whose WAB-R scores declined or were maintained were not possible because only two of the decliners and three of the maintainers completed the discourse tasks at both test times, and only two decliners and four maintainers completed the BNT and VNT.) We looked at the following measures:
Table 4 shows the retest data that yielded significant results. One-tailed paired t-tests showed significant improvement on both the BNT and VNT as well as MLU for free speech samples, possibly reflected in the fluency subtest of the WAB-R. TTR for the Cinderella story retell decreased significantly, which may be due to the larger number of words used in the final Cinderella retellings. At last testing, the mean number of total words increased from 120.6 to 160.3 for the Cinderella retelling, from 209.6 to 246.1 words for the Free Speech tasks, and decreased from 145.5 to 142.1 for the Picture tasks. None of the other measures showed significant changes from first to last testing, probably due to the small sample size and extensive variability.
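The sensitivity of TTR to sample length can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions: tokens are split on whitespace, MLU is counted in words per utterance, and the two word lists are invented for illustration. It does not reproduce the CLAN analyses used in the study.

```python
def type_token_ratio(tokens):
    """Number of distinct word forms divided by the total number of words."""
    lowered = [t.lower() for t in tokens]
    return len(set(lowered)) / len(lowered) if lowered else 0.0


def mean_length_of_utterance(utterances):
    """Mean number of words per utterance (a word-based simplification)."""
    if not utterances:
        return 0.0
    return sum(len(u) for u in utterances) / len(utterances)


# Hypothetical samples, not taken from AphasiaBank transcripts.
short_retell = "cinderella lost her shoe".split()
long_retell = (
    "cinderella lost her shoe and the prince found the shoe "
    "and he found her"
).split()

print(type_token_ratio(short_retell))  # 1.0 (4 types / 4 tokens)
print(type_token_ratio(long_retell))   # 9 types / 14 tokens, about 0.64

utterances = [
    "cinderella lost her shoe".split(),
    "the prince found the shoe".split(),
]
print(mean_length_of_utterance(utterances))  # 4.5
```

The longer retelling carries more content yet has the lower TTR, because function words and key nouns recur as a sample grows. This is consistent with the interpretation that the TTR decrease may follow from the larger number of words produced.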
All 11 participants who improved on the WAB-R showed changes in their discourse. Using a criterion of greater than or equal to one standard deviation, Table 5 provides a list of the increases and decreases observed for each participant from first testing to last testing for each discourse measure and each type of discourse task. Unlike formal test scores (WAB-R, VNT, BNT), discourse measure increases may not always represent improvement. For example, an increase in total number of utterances and MLU may not be evidence of improvement for someone with Wernicke’s aphasia. Conversely, a decrease in TTR may indicate more consistent use of precise, intended lexical items. This cursory examination is intended to show the many individual changes that occur on re-administration of these discourse tasks, likely representing normal variation in discourse performance and style.
Finally, we asked clinical staff members who had contact with all of the individuals at their respective centers to identify which, if any, of the 26 participants in this study showed evidence of worsening aphasic impairment in their center involvement and interactions. Two of the three who showed WAB declines were identified as decliners by their centers. The third, the study’s oldest, died within a few months of participating for a second time. No other participants were identified as declining. As of 2015, only one other individual (an improver) had died.
This study reports significant improvements on WAB-R AQ scores in 16 of 26 individuals with chronic aphasia who were first tested an average of almost five years post-onset and retested an average of almost four years later. They also improved significantly on BNT and VNT scores, and they showed changes in several discourse measures from their language samples. The WAB-R AQ scores of seven PWA were maintained over time, and scores for the final three PWA declined. The discussion will first focus on the questions raised by this study’s largely unexpected results, followed by some possible explanations. These results call into question the almost universal belief that the natural course of aphasia recovery does not extend much beyond one year. This is a bleak prospect for individuals with aphasia and their families. Our growing awareness of brain plasticity and potential for change, advances in medicine, the gradual decrease of age at onset of stroke, and studies such as this all suggest that the question of improvement of the aphasic impairment should be revisited.
Why does change continue to occur after a year? Plowman, Hentz, and Ellis (2012), in their review of studies of aphasia recovery, found conflicting reports of the influence of age on aphasia recovery, and concluded that no clear relationship between age and recovery emerged. Statistical comparisons of demographic factors across groups were not possible in this project, but all three PWA in the declining group were older than the mean ages of the PWA improvers and maintainers. Time post-onset was not likely a factor given that the means and ranges across groups were quite similar. Only two PWA, both improvers, were initially tested by AphasiaBank at less than one year post-onset (one at 6 months and the other at 11 months post-onset). However, the improvement scores for both of these PWA were lower than the mean improvement score for the whole group of improvers. Furthermore, the individual with the longest time post-onset (15.25 years) at the study’s beginning was among the improvers. Thus, with the exception of age, this study echoes the findings of Plowman, Hentz, and Ellis (2012). Like that study, this study indicates that it is unlikely that demographic factors played a significant role in explaining the changes. However, the influence of these factors on linguistic changes warrants study of a larger sample of chronically impaired PWA before more definitive conclusions can be drawn.
Those who maintained their language abilities throughout their participation in this study, without formal treatment, also represent a positive result. Although they were aging, as were the decliners and the improvers, their linguistic skills remained consistent. Why or how they maintained their level of impairment is a question that also requires further study.
The improvers and maintainers present an important contrast to what is described in the literature for people with Primary Progressive Aphasia (PPA), where symptoms that resemble various types of stroke-induced aphasia (e.g., Broca’s, Anomic, Conduction) develop slowly over the course of a year or two and then continue to increase in severity over time (Gorno-Tempini, Dronkers, et al., 2004; Grossman & Ash, 2004; Wilson et al., 2010). It is indeed possible that even those who declined in this study did so in a way that differs from that seen in PPA. Again, it is a question for further study.
AphasiaBank encourages the inclusion of neuroimaging data, and indeed some studies that use the whole or modified AphasiaBank protocols and methodologies (Basilakos et al., 2014; Fridriksson et al., 2012) incorporate imaging in their research design. However, such data, or even relevant reports of them, are seldom available from the standard files of most clinical aphasia treatment programs, including the centers that provided these data. This prevents long-term investigations such as this one from examining the clearly important impact of variables such as initial stroke severity and lesion size and location, which were identified by Plowman, Hentz, and Ellis (2012).
The individuals studied here come from diverse backgrounds, with differing educational achievements, occupations, support systems, living circumstances, and health issues. None were engaged in formal language therapy, largely because they had run out of benefits, were not in environments where new approaches to treatment were being developed, or could not afford continued therapy from speech-language pathologists. However, they became involved with AphasiaBank because they attended community aphasia programs or centers. The centers represented here see between 40 and 60 or more individuals per week. These centers serve members across a wide age range (from early 30s to late 80s, with an approximate mean age of 65 years), and time post-stroke for all these program participants ranges from less than one year to over 20 years. Center members and staff have welcomed us and volunteered to help others with aphasia by being involved in research such as ours. All of the above suggests that these PWA and/or their families understand the importance of meeting others with aphasia and participating in activities that make them feel connected to others. They join with others in getting on with life and reengaging with the world to the extent they wish to, despite aphasia.
Chronic disorders such as arthritis, diabetes, and asthma have been shown to benefit from long-term, interpersonal communication-focused, cost-effective, and family and self-directed management (Lorig et al., 2000). Aphasia is often chronic, and is also likely to benefit from long-term, cost-effective, and interpersonal communication-focused (in addition to language-focused) intervention. Obviously, this interpretation is speculative and cannot prove the value of careful attention to psychosocial needs in the management of aphasia. But it does furnish substantial food for thought, and underscores the need to develop principled ways to measure their value.
Finally, this paper also recognizes and underscores the need, however difficult it is, to collect such data, and to extend the time frame for studying stroke and its consequences, particularly aphasia. Inarguably, such data will always lag behind advances in stroke management and rehabilitation. But the picture of the course of recovery, life adjustments and change may be less bleak than current data would suggest.
This work was funded by NIH-NIDCD grant R01-DC008524 (2012–2017). The authors thank the persons with aphasia who generously agreed to participate in the extensive AphasiaBank protocol and to allow AphasiaBank researchers to video the process. All were members of one of the following aphasia programs: Adler Aphasia Center, Maywood NJ; Aphasia Center of California, Oakland CA; Snyder Center for Aphasia Life Enhancement (SCALE), Baltimore MD; Stroke Comeback Center, Vienna VA; Aphasia Center of Tucson, Tucson, AZ. The authors also thank the staffs of these centers for allowing AphasiaBank researchers to seek participants from their programs. Finally, the authors thank Nina Simmons-Mackie for her careful reading and critique of this manuscript, and the reviewers for their careful attention and comments.
1Or perhaps less ostentatiously, “luck”.
2We report here only on the first and last AphasiaBank visits.
3The WAB-SEM was calculated on the basis of data provided in the WAB-R manual for the second standardization group of 215 PWA who received the AQ portion of the WAB-R. For 141 “aphasics with infarcts” (Kertesz, 2006, p. 97), the SEM is 2.52, the result of dividing the standard deviation (29.9) by the square root of the sample size (11.87, square root of 141).
Audrey Holland, University of Arizona.
Davida Fromm, Carnegie Mellon University.
Margaret Forbes, Carnegie Mellon University.
Brian MacWhinney, Carnegie Mellon University.