We believe this is the first study to show that on-line mimicking of audio-visual speech allows patients with Broca’s aphasia to increase their speech output by a factor of >2. Importantly, speech entrainment that relies only on audition does not have the same beneficial effect. This suggests that seeing the mouth of the speech model provides crucial information that allows patients to increase their own speech output. Speech entrainment that relies only on visual (and not auditory) speech yields almost no speech production, even for normal subjects. Therefore, the role of auditory speech in speech entrainment should not be understated.
Although the primary outcome measure here was the number of different words patients could produce in a given condition, our observations suggested that speech entrainment elicited much improved speech fluency among the patients (as seen in Supplementary Video 1). Speech entrainment facilitates speech production in a way that permits patients with Broca’s aphasia to produce a greater variety of words and speak at a rate that may better approximate normal fluency compared with their spontaneous speech. This suggests that the motor commands for speech are relatively intact. Unless a mechanism that encodes the motor commands of speech is at least partially preserved, it is difficult to imagine that someone like G.B., who has moderate apraxia of speech, could produce on-line speech with such clarity. Although his speech articulation is somewhat slurred, it must be remembered that his stroke occurred more than two decades ago and that his speech mechanism has been almost quiescent since then.
Can we identify a mechanism that facilitates speech entrainment? We suggest that the neuroimaging results provide some insight. Greater cortical activity associated with spontaneous speech compared with speech entrainment was found mostly in medial regions commonly thought to support reminiscing and recall of events (Burgess et al., 2001; LaBar and Cabeza, 2006). The opposite contrast (speech entrainment–audio visual > spontaneous speech) revealed the greatest activation in regions that support lexical retrieval (left BA 37; Hillis et al., 2001; Cloutman et al., 2009) and, perhaps, a mechanism that modifies visceral functions (mostly respiration) for speech production (anterior insula/BA 47; Augustine, 1996; Gracco et al., 2005; Shum et al., 2011; Tremblay et al., 2012). Our probabilistic tractography analysis demonstrates that these regions are connected via ventral fibres and not through the arcuate fasciculus, a tract that is crucial for speech articulation (Hickok and Poeppel, 2007; Hickok, 2012). We suggest that this ventral network encodes the conceptual aspects of speech, including at a minimum the lexicon and aspects of homeostatic functions during speech production (Craig, 2009), and that it is enslaved by a dorsal network involving middle temporal gyrus, superior temporal gyrus and dorsal regions of Broca’s area (connected via the arcuate fasciculus).
It is a common complaint of patients with Broca’s aphasia that they know what they want to say but just cannot get the words out. Although Broca’s aphasia has been amply described elsewhere (e.g. Geschwind, 1965; Goodglass et al., 1976; Goodglass, 1993), it appears that these individuals suffer from a ‘language inertia’ manifesting as an inability to translate and execute a language code into speech production. Given that at least some of them can speak using speech entrainment, it would seem that lexical retrieval and speech articulation are relatively intact, although, in some cases, not within normal limits.
One treatment method that has garnered considerable attention as a means to enable patients with Broca’s aphasia to speak is melodic intonation therapy (Albert et al., 1973; Sparks and Holland, 1976; Helm-Estabrooks et al., 1989). Primarily, this method focuses on patients intoning speech (singing) but also emphasizes speech rhythm. Recently, Stahl et al. (2011) demonstrated that providing an external rhythm (using a metronome) promotes greater speech output compared with actual intoning of speech in patients with Broca’s aphasia. On this basis, the group suggested that damage to the basal ganglia was a crucial predictor of speech production. The basal ganglia are thought to play a crucial role in the timing of motor activity and, when damaged, could affect the timing of motor speech (Fridriksson et al., 2005; Giraud et al., 2008; Bohland et al., 2010; Lu et al., 2010a; Stahl et al., 2011). As can be seen in A, the basal ganglia were particularly active in both the speech entrainment and spontaneous speech conditions, suggesting that speech entrainment may provide patients with crucial speech rhythm that was affected by stroke. However, the basal ganglia were at least partially preserved in all of the patients included in this research (), suggesting that speech entrainment probably does not compensate for an impaired basal ganglia function.
The ‘Directions into Velocities of the Articulators’ model suggests that posterior regions in the left hemisphere provide feed-forward information during speech production: the supramarginal gyrus provides somatosensory targets, and the superior temporal gyrus and superior temporal sulcus provide auditory targets for speech articulation (Guenther et al., 1998; Bohland et al., 2010; Golfinopoulos et al., 2010). In a recent review that integrates the ‘Directions into Velocities of the Articulators’ model with more conventional psycholinguistic models of speech production, Hickok (2012) suggests that coordination of the motor programmes of speech in Broca’s area (specifically BA 44) and auditory syllable targets in the superior temporal gyrus and superior temporal sulcus is supported by an area in the sylvian region at the parietal–temporal boundary. The crucial lesion location that causes impaired speech is in Broca’s area (Hillis et al., 2004; Richardson et al., 2012). Consistent with Hickok’s (2012) account, the patients studied here are non-fluent because the motor programmes for speech are impaired. However, speech entrainment permitted speech fluency in these individuals, raising the possibility that some other mechanism might be impaired. We suggest that this crucial mechanism relates to binding and temporal gating (in Broca’s area) that pulls along (entrains) lexical processing (BA 37) and on-line modifications of visceral functions for speech production (anterior insula/BA 47). Although lexical processing and on-line visceral modifications for speech production would be expected in both the speech entrainment–audio visual and spontaneous speech conditions, normal participants who completed the functional MRI portion of this study reported having to rely heavily on prediction to execute the speech entrainment–audio visual task. This suggests that greater activity in BA 37 and anterior insula/BA 47 could reflect on-line predictions of the upcoming word and modifications to respiratory state to reflect utterance length (e.g. greater inspiration for longer sentences). Furthermore, we propose that speech entrainment works by providing an external gating mechanism via a visual route (i.e. watching a speech model) that compensates for damage to Broca’s area. However, as demonstrated in , speech entrainment does not work for all patients, especially in some cases of very severe apraxia of speech. This implies that motor speech processing needs to be at least partially preserved. Interestingly, the benefit of speech entrainment that only relies on audition is not related to apraxia of speech severity, suggesting that visual speech perception provides crucial feedback that primarily benefits non-fluent patients with milder forms of apraxia of speech. Although Broca’s area has been suggested as a crucial area for binding for syntactical processing (Hagoort, 2003), we suggest it has a role in uniting language and articulatory processing for speech production. Using neurotransmitter receptor mapping, Amunts et al. (2010) suggested that Broca’s area and surrounding cortex can be further subdivided beyond the traditional BA 44 and BA 45 into several different regions including dorsal (44d) and ventral (44v) regions of BA 44. Accordingly, it is pertinent to emphasize that the gating mechanism that we propose could potentially be attributed to a sub-region of Broca’s area, perhaps in area 44d (), although a precise anatomical substrate of this mechanism cannot be determined based on the current data.
Consistent with the Hierarchical State Feedback Control model, fluent speech production relies on feed-forward information for auditory and somatosensory targets. Although speech entrainment may function to allow patients with Broca’s aphasia to successfully gate visceral aspects of speech and lexical processing, it is clear that other regions must compensate for damage to Broca’s area. Based on our data that showed greater activation at the left temporal–parietal junction for patients compared with their normal counterparts for ‘speech entrainment–audio visual > spontaneous speech’, it is possible that this compensation occurs primarily in and around the sylvian region at the parietal–temporal boundary. Cortical activity in approximately the same region was reduced following speech entrainment training in the patients. This provides further evidence that this region may play a crucial role in compensation for impaired gating at the level of Broca’s area. Clearly, more data are needed to verify our account regarding speech entrainment and its relation to the importance of Broca’s area in gating for speech production. However, the present data provide reasonable evidence to support such an explanation. Furthermore, we would emphasize that any hypothesis explaining speech entrainment in Broca’s aphasia would have to account for the fact that in many such individuals, the motor programmes for speech are at least relatively intact.
Although speech entrainment in patients may be interesting from a theoretical point of view as something that could inform normal speech production, our interest was piqued as we saw this as a potential mechanism to treat impaired speech production in non-fluent aphasia. For many patients with Broca’s aphasia, speech output is severely limited. If speech entrainment promotes relatively improved speech fluency among patients, it could potentially be used as a way to practice speech over a prolonged period of time. Our results suggest that speech entrainment training not only promotes greater speech output while patients rely on speech entrainment but also improves their ability to speak without it. During the week after completion of the 6-week training phase, patients were able to produce a greater variety of words for trained and untrained scripts while using speech entrainment as well as during a spontaneous speech condition. Importantly, these findings suggest that speech entrainment training generalizes to spontaneous speech. Overall, patients were able to produce >60% more words upon training completion. If this effect is verified in future studies, speech entrainment will have to be considered as an important treatment option for this patient population. Equally important, more research is needed to understand factors that relate to treatment outcome with speech entrainment.
Many different aphasia treatment approaches have been tested with patients with non-fluent aphasia (Rosenbek et al., 1973; Thompson and McReynolds, 1986; Wambaugh, 2006a, b; Crosson et al., 2009; Lee et al., 2009; Kiran et al., 2011; Hurkmans et al., 2012). Perhaps the most similar approach to speech entrainment is melodic intonation therapy, a technique that emphasizes melody and rhythm to produce fluent speech in non-fluent patients. As reviewed by van der Meulen et al. (2012), a considerable number of studies have examined the effect of melodic intonation therapy on speech production in non-fluent aphasia. The vast majority of these studies relied on single case examination or small groups of patients. As is often the case with aphasia treatment research, outcome measures varied widely across studies. Nevertheless, at least one group study by Bonakdarpour et al. (2003) did find that melodic intonation therapy training was associated with an increased number of correct information units in seven patients with non-fluent aphasia and that this increase generalized to spontaneous speech, a finding that is consistent with our results. Clearly, far more research is needed to verify the effects of melodic intonation therapy and speech entrainment to treat non-fluent aphasia. Given the current results, it would seem that comparisons between these two approaches in a larger number of patients are warranted.
As we suggested earlier, even for patients whose spontaneous speech does not improve, it would seem that improvement in speech production that relies on speech entrainment could also promote greater quality of life if patients are able to practice personal scripts that can be used with simple hand-held devices in highly predictable contexts (e.g. telling one’s own story of stroke or what it is like to have aphasia) or for short, communicatively rich personally selected phrases.