Lesion-symptom mapping studies aim to make inferences about the functional neuroanatomy of spoken language understanding by investigating relationships between damage to different brain regions and the various speech perception and comprehension deficits that result. Voxel-based lesion-symptom mapping (VLSM), voxel-based morphometry (VBM), and studies focused on specific cortical regions of interest or fiber pathways have all yielded insights regarding the localization of different components of spoken language processing. Major challenges include the fact that brain damage rarely impacts just a single brain region or just a single processing component, and that neuroplasticity and recovery can complicate the interpretation of lesion-deficit correlations. Future studies involving large patient cohorts derived from multi-center projects, and multivariate approaches to quantifying patterns of brain damage and patterns of linguistic deficits, will continue to yield new insights into the neural basis of spoken language understanding.
Lesion-symptom mapping studies seek to make inferences about the functional neuroanatomy of linguistic or cognitive processes by investigating relationships between damage to different brain regions and the behavioral deficits that result. This general approach dates back to the seminal contributions of Broca (1861) and Wernicke (1874), and some even earlier observations (Benton & Joynt, 1960). Although some early authors studied series of patients (Moutier, 1908), much of the early literature was dominated by case reports of single patients (Caplan, 1987). This was a significant limitation, because it was unclear how generalizable many of the findings were.
The emergence of CT and MRI in the 1970s and 1980s made it possible to identify lesion locations before waiting potentially decades for patients to come to autopsy. This made it much more feasible to investigate brain-behavior relationships in series of patients, so that common principles of functional organization could be determined. The most prevalent approach in the first few decades of neuroimaging-based lesion-symptom mapping was to “overlay” representations of the lesions of patients with a common clinical syndrome, in order to determine which brain regions were invariably associated with the syndrome (Mohr, 1976).
In some of the earliest applications of this approach to disorders of spoken language understanding, Kertesz, Lesk, and McCabe (1977) and Naeser and Hayward (1978) overlaid the lesions of patients with Wernicke's aphasia. Both studies demonstrated consistent involvement of the left superior temporal gyrus. Kertesz, Sheppard, and MacKenzie (1982) showed that transcortical sensory aphasia, which involves a comprehension deficit distinct in nature from that of Wernicke's aphasia, was associated with a different lesion localization, specifically posterior inferior temporal and occipital regions. The lesion overlay approach was not limited to focal damage caused by stroke. For instance, the neurodegenerative syndrome of semantic dementia, which involves impaired single word comprehension among other semantic deficits, was shown to be consistently associated with damage to the anterior temporal lobes (Hodges, Patterson, Oxbury, & Funnell, 1992; Mummery et al., 2000).
In the 1990s, researchers began to derive lesion overlays not just for clinical syndromes (that is, constellations of symptoms), but for specific functional deficits (that is, single symptoms). In the domain of language, expressive functions including naming (Damasio, Grabowski, Tranel, Hichwa, & Damasio, 1996) and speech motor planning and programming (Dronkers, 1996) were investigated. Researchers also began to go beyond lesion overlays, for instance by presenting complementary lesion overlays of patients who lacked the deficit in question (Dronkers, 1996), by computing various statistics voxel-by-voxel to quantify the impact of damage to each voxel on performance (Adolphs, Damasio, Tranel, Cooper, & Damasio, 2000; Bates et al., 2003), and by using multivariate analyses to investigate the contributions of multiple brain regions (Caplan, Hildebrandt, & Makris, 1996; Caplan et al., 2007).
The first large-scale application of a symptom-specific approach to spoken language understanding was a study by Bates et al. (2003), who investigated the neural correlates of auditory language comprehension in 101 patients with chronic aphasia due to left hemisphere stroke. Statistics were computed on a voxel-by-voxel basis, and the resulting map showed that damage to the left posterior middle temporal gyrus was most predictive of comprehension deficits, a more ventral lesion localization than might have been expected based on the classical model of the neural organization of language. Another contribution of this study was the proposal that continuous behavioral measures, rather than defined cutoff scores, should be used for lesion-symptom mapping, on the grounds that this makes best use of all available data.
Voxel-based approaches utilizing continuous behavioral measures also proved effective in neurodegenerative populations. For instance, Amici et al. (2007) investigated sentence comprehension in 58 patients with primary progressive aphasia or other neurodegenerative syndromes, and showed that deficits in the comprehension of complex syntactic structures were associated with atrophy of specific left inferior frontal regions.
Most modern studies use voxel-based approaches, in which statistical computations are carried out for each individual voxel to quantify the extent to which structural integrity of that voxel is associated with the language variable(s) of interest. Voxel-based lesion-symptom mapping (VLSM; Bates et al., 2003) and voxel-based morphometry (VBM; Ashburner and Friston, 2000) are two particularly widely used approaches, the most fundamental difference between them being whether the structural integrity of each voxel is modeled as binary, in the case of VLSM, or graded, in the case of VBM (Geva, Baron, Price, Jones, & Warburton, 2012). This section outlines the general steps that are common to VLSM and VBM analyses, before discussing other approaches which do not involve calculations for each voxel, but rather quantify the impact of damage to specific regions of interest (Caplan et al., 1996, 2007) or white matter pathways (Wilson et al., 2011).
In VLSM and VBM studies, the first step is to define a cohort of patients. In order to identify brain-behavior relationships, the group of patients needs to exhibit variability in terms of the behavioral variable(s) of interest, and also in terms of whether or not brain regions hypothesized to be important for those function(s) are damaged. Most successful VLSM and VBM studies have been based on groups of at least 50 to 100 patients.
Second, brain damage needs to be quantified on a voxel-by-voxel basis. VLSM assumes that lesions are discrete, that is, each voxel is either lesioned or it is not. This is generally appropriate for neurological populations such as stroke and resective surgery in which lesions are largely discrete. Most researchers consider manual drawing of lesions to be the “gold standard” for lesion delineation. One method is to draw each patient's lesion on a template in a single, common space (e.g. Bates et al., 2003; Damasio et al., 1996). That way, the impact of large lesions on brain morphology (e.g. expansion of adjacent ventricles) can be considered and taken into account. However, drawing lesions on templates requires great expertise and invariably involves subjectivity. An alternative is to draw each patient's lesion on their own MRI or CT image, and then warp each brain (and associated lesion mask) to standard space (e.g. Wilson et al., 2015). This method is somewhat easier to implement, because the correspondence between the patient's brain and standard space is handled by the warping algorithm (e.g. Ashburner and Friston, 2005), but it may be less accurate, because normalization algorithms do not always fare well in mapping distorted brains to standard space.
An alternative to manual delineation of lesions is the use of fully automated or semi-automated lesion segmentation algorithms (Leff et al., 2009; Seghier, Ramlackhansingh, Crinion, Leff, & Price, 2008; Tyler, Marslen-Wilson, & Stamatakis, 2005). This approach has the advantage of being objective and quick, which is important when studying large groups of patients. The correspondence between automatically and manually delineated lesions is presently only modest (Wilke, de Haan, Juenger, & Karnath, 2011), but continued advances in automated lesion delineation (Griffis, Allendorfer, & Szaflarski, 2016; Pustina et al., 2016) offer the promise of increasingly robust and valid methods that may in time perform as well as, if not better than, manual delineation (Crinion, Holland, Copland, Thompson, & Hillis, 2013).
In contrast to VLSM, VBM is intended for neurological populations in which damage is graded. In the domain of spoken language understanding, patients with neurodegenerative disease have been particularly informative. Since damage is graded, only automated approaches are used. Segmentation algorithms are used to estimate gray matter (GM), white matter (WM) and cerebrospinal fluid (CSF) proportions in each voxel, in order to identify regions exhibiting parenchymal volume loss. The most widely used algorithm is the “unified segmentation” procedure implemented in SPM (Ashburner and Friston, 2005), which estimates GM, WM and CSF densities while simultaneously performing bias correction and warping to standard space. Many recent studies use the DARTEL algorithm (Ashburner, 2007) which performs more anatomically precise registration (Klein et al., 2009). Analyses are usually based on estimated GM density (e.g. Gorno-Tempini et al., 2004), but it is also possible to use VBM to look at WM density (e.g. Rohrer et al., 2010), or to sum GM and WM densities to obtain a general measure of parenchymal atrophy (Wilson et al., 2010).
The third step in VLSM and VBM studies is to calculate statistical relationships between damage to each voxel and behavioral variable(s) of interest. VLSM and VBM are mass univariate approaches, similar to standard fMRI analyses. This means that each voxel is analyzed independently of any other voxels. In VLSM, a t-test is performed at each voxel comparing patients with damage to the voxel to those without damage on the measure(s) of interest. In VBM, correlations are computed between GM density (or whatever metric of structural integrity is being used) and the behavioral measure(s) of interest.
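The mass univariate step described above can be illustrated with a minimal NumPy sketch. It assumes that images have already been warped to standard space and flattened to patient-by-voxel arrays; the function names, the `min_patients` parameter, and the choice of a Welch (unequal variance) t statistic are assumptions made for illustration, not the implementation of any particular toolbox.

```python
import numpy as np

def vlsm_t_map(lesions, scores, min_patients=2):
    """Welch t statistic per voxel: spared-group mean minus lesioned-group mean.

    lesions: (n_patients, n_voxels) boolean array, True = voxel lesioned.
    scores:  (n_patients,) behavioral scores, higher = better performance.
    Voxels with fewer than min_patients in either group are left as NaN.
    """
    lesions = np.asarray(lesions, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    t = np.full(lesions.shape[1], np.nan)
    for v in range(lesions.shape[1]):
        damaged = scores[lesions[:, v]]
        spared = scores[~lesions[:, v]]
        if min(damaged.size, spared.size) < min_patients:
            continue
        se = np.sqrt(damaged.var(ddof=1) / damaged.size
                     + spared.var(ddof=1) / spared.size)
        if se > 0:
            t[v] = (spared.mean() - damaged.mean()) / se
    return t

def vbm_r_map(density, scores):
    """Pearson correlation per voxel between tissue density and behavior.

    density: (n_patients, n_voxels) array of e.g. GM density estimates.
    Voxels with no variance across patients come out as NaN.
    """
    density = np.asarray(density, dtype=float)
    scores = np.asarray(scores, dtype=float)
    d = density - density.mean(axis=0)
    s = scores - scores.mean()
    denom = np.sqrt((d ** 2).sum(axis=0) * (s ** 2).sum())
    with np.errstate(invalid="ignore", divide="ignore"):
        return (d * s[:, None]).sum(axis=0) / denom
```

Each voxel is handled independently of every other voxel in both functions, which is what makes the approach mass univariate; the resulting maps are uncorrected and still need to be thresholded.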
Fourth, these statistical maps are corrected for multiple comparisons in order to avoid false positives due to the thousands of voxels in the brain. For VBM, Gaussian random field theory has been shown to effectively correct for multiple comparisons when using a corrected threshold, but not when using approaches based on cluster extent (Ashburner & Friston, 2000). For VLSM and VBM, many researchers have elected to control the false discovery rate (e.g. Bates et al., 2003); however, it is unclear exactly how the non-independence of the multiple tests impacts this approach, and it does not control family-wise error. The author recommends the use of permutation-based thresholding methods, in which lesions and behavioral data are randomly reassigned many times in order to determine how likely the observed results would be under the null hypothesis that there is no relationship between lesion location and behavior. This approach makes no problematic assumptions about the structure of the data, and is easy to implement now that the necessary computing power is readily available (Nichols & Brett, 2002; Kimberg, Coslett, & Schwartz, 2007).
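One common way to realize permutation-based thresholding is to record the maximum statistic across all voxels in each permutation and take a high quantile of that distribution as the family-wise-error-corrected threshold. The sketch below assumes binary lesion masks, a one-sided test (damage predicts lower scores), and a pooled-variance t statistic; the function names and the `min_patients`, `n_perm`, and `seed` parameters are hypothetical, and published toolboxes may differ in their details.

```python
import numpy as np

def t_map(lesions, scores):
    """Pooled-variance t statistic per voxel (spared minus lesioned), vectorized."""
    L = np.asarray(lesions, dtype=float)
    y = np.asarray(scores, dtype=float)
    n1 = L.sum(axis=0)                # patients with the voxel lesioned
    n0 = L.shape[0] - n1              # patients with the voxel spared
    with np.errstate(invalid="ignore", divide="ignore"):
        m1 = (L * y[:, None]).sum(axis=0) / n1
        m0 = ((1 - L) * y[:, None]).sum(axis=0) / n0
        ss = ((L * (y[:, None] - m1) ** 2).sum(axis=0)
              + ((1 - L) * (y[:, None] - m0) ** 2).sum(axis=0))
        t = (m0 - m1) / np.sqrt(ss / (n1 + n0 - 2) * (1 / n1 + 1 / n0))
    t[~np.isfinite(t)] = np.nan       # undefined where a group is empty or constant
    return t

def permutation_threshold(lesions, scores, n_perm=1000, alpha=0.05,
                          min_patients=5, seed=0):
    """Observed t map plus a corrected threshold from the max-statistic null."""
    L = np.asarray(lesions, dtype=float)
    y = np.asarray(scores, dtype=float)
    n1 = L.sum(axis=0)
    # Only test voxels with enough lesioned and enough spared patients.
    keep = (n1 >= min_patients) & (L.shape[0] - n1 >= min_patients)
    L = L[:, keep]
    rng = np.random.default_rng(seed)
    max_t = np.empty(n_perm)
    for i in range(n_perm):
        # Shuffling scores breaks any true lesion-behavior relationship
        # while preserving the spatial structure of the lesions.
        max_t[i] = np.nanmax(t_map(L, rng.permutation(y)))
    threshold = np.quantile(max_t, 1 - alpha)
    observed = np.full(keep.size, np.nan)
    observed[keep] = t_map(L, y)
    return observed, threshold
```

Voxels whose observed statistic exceeds the returned threshold would be reported as significant. Because whole lesion patterns stay intact across permutations, the spatial dependence between voxels is reflected in the null distribution without any parametric assumption.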
The line between VLSM and VBM is often blurred. For instance, Leff et al. (2009) used VBM in a study of stroke patients, assuming that lesions would be segmented as containing negligible GM, and that furthermore, there might be volume loss remote from the site of infarction that this approach would be sensitive to (see Geva et al. (2012) for an empirical comparison of VLSM and VBM in stroke patients, and a discussion of the advantages and disadvantages of each approach). In another study, Wilson et al. (2015) performed manual delineation of lesions, yet smoothed the resulting lesion masks in order to account for inter-individual variability. Because the lesion masks were smoothed, estimates of structural integrity were continuous rather than binary.
There are other approaches to lesion-symptom mapping that do not involve voxel-by-voxel computations. Some researchers quantify the extent to which anatomical regions of interest are lesioned, and then use these estimates as independent variables to predict behavioral measure(s) of interest (Caplan et al., 1996, 2007). With sufficient numbers of patients, multivariate analyses are then feasible, which could potentially show differential or interacting effects of damage to different regions. A limitation of this approach is that a priori hypotheses are required regarding which regions of interest to investigate. Another line of work investigates whether the integrity of white matter tracts is predictive of language deficits. Tract integrity can be quantified in terms of fractional anisotropy (Wilson et al., 2011) or other DTI metrics (Galantucci et al., 2011), especially when damage is graded as in neurodegenerative disease. Alternatively, the integrity of tracts can be quantified by determining to what extent the tracts have been impacted by lesions (Griffiths, Marslen-Wilson, Stamatakis, & Tyler, 2013; Han et al., 2013).
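The region-of-interest approach can be sketched as follows, under the assumption that each voxel in standard space carries an integer atlas label and that damage to a region is summarized as the proportion of its voxels that are lesioned. The function names and the labeling scheme are hypothetical, and ordinary least squares stands in for whatever model a given study actually used.

```python
import numpy as np

def roi_damage(lesions, roi_labels, n_rois):
    """Proportion of each ROI's voxels that are lesioned, per patient.

    lesions:    (n_patients, n_voxels) boolean array, True = lesioned.
    roi_labels: (n_voxels,) integer labels in 0..n_rois-1 (-1 = no ROI).
    Returns an (n_patients, n_rois) array of values between 0 and 1.
    """
    lesions = np.asarray(lesions, dtype=float)
    roi_labels = np.asarray(roi_labels)
    return np.column_stack(
        [lesions[:, roi_labels == r].mean(axis=1) for r in range(n_rois)]
    )

def fit_roi_model(damage, scores):
    """Least-squares fit of scores on ROI damage.

    Returns [intercept, slope for ROI 0, slope for ROI 1, ...].
    """
    scores = np.asarray(scores, dtype=float)
    X = np.column_stack([np.ones(len(scores)), damage])
    beta, *_ = np.linalg.lstsq(X, scores, rcond=None)
    return beta
```

A negative slope for a region indicates that greater damage to that region goes with lower scores once damage to the other modeled regions is taken into account.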
The structural images and behavioral measures that go into lesion-symptom mapping analyses are both static, so there are no particular limitations on the experimental conditions under which deficits in spoken language understanding can be characterized. There are, however, general limitations to these approaches that need to be considered.
The most fundamental limitation of VLSM and VBM is that the statistic for each voxel is computed independently of any other voxel, yet behavioral deficit(s) are caused not by damage to a single voxel, but by damage to one or more brain regions, each of which contains many voxels. This implies that a statistically significant relationship between damage to a voxel and a behavioral deficit can never be taken at face value. It might be that the voxel in question really is important for the behavior, but alternatively, it could be that neighboring structures, typically damaged simultaneously with the voxel, are really critical for the function in question. For instance, it has been argued that motor speech deficits are caused not by damage to the anterior insula (Dronkers, 1996), but rather by damage to the white matter pathways which run medial to it (Bonilha and Fridriksson, 2009). Or, it could be that the voxel does play a role, but that deficits only occur if other structures are damaged too. For instance, a small inferior frontal lesion does not cause persistent agrammatism, but a large fronto-parietal lesion does (Mohr, 1976). False negatives are possible too, especially for functions that are supported by both hemispheres. Primary auditory cortex in both hemispheres appears to be able to support early stages of spoken language understanding, so an isolated lesion to left auditory cortex often does not result in any persistent deficits unless right auditory cortex is damaged too. Consequently, voxel-based methods may not show a significant relationship between left or right auditory cortex and any language measure, even though both regions contribute to the function.
These problems can be addressed by carrying out analyses which include the structural integrity of multiple brain regions as independent variables. One way to do this is to covary out the lesion status of other regions that might be important for the function of interest (e.g. Bates et al., 2003). Another possibility is to perform post-hoc multiple regression analyses that model the function in terms of the potentially interacting contributions from multiple distinct brain regions (Rankin et al., 2009; Wilson et al., 2015). As mentioned above, some authors do not use voxel-based analyses at all, but rather skip straight to multivariate analyses involving multiple regions of interest (Caplan et al., 1996, 2007). The success of multivariate analyses depends on having very large numbers of patients, because the sample must contain patients representing different combinations of regions damaged, in order to determine which region(s) are actually important for the function of interest.
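To illustrate why interacting contributions and sample coverage both matter, the sketch below fits a two-region model with an interaction term and tabulates how many patients fall into each combination of damaged and spared regions. The function names and the 0.5 damage cutoff are assumptions chosen for illustration only.

```python
import numpy as np

def damage_combinations(damage, cutoff=0.5):
    """Count patients showing each pattern of damaged (1) / spared (0) regions.

    damage: (n_patients, n_regions) proportions of each region lesioned.
    Returns a dict mapping patterns such as (1, 0) to patient counts.
    """
    patterns = (np.asarray(damage) >= cutoff).astype(int)
    combos, counts = np.unique(patterns, axis=0, return_counts=True)
    return {tuple(int(x) for x in c): int(n) for c, n in zip(combos, counts)}

def fit_interaction_model(damage_a, damage_b, scores):
    """Least-squares fit of scores on two regions plus their interaction."""
    a = np.asarray(damage_a, dtype=float)
    b = np.asarray(damage_b, dtype=float)
    y = np.asarray(scores, dtype=float)
    X = np.column_stack([np.ones_like(a), a, b, a * b])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return dict(zip(("intercept", "a", "b", "a_x_b"), (float(v) for v in beta)))
```

If a deficit arises only when both regions are damaged, the main effects come out near zero and the effect is carried by the interaction coefficient. That coefficient can be estimated only when all four damage combinations are represented in the sample, which is one reason such analyses require large cohorts.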
Another ubiquitous limitation of lesion-symptom mapping studies is that patients generally recover to varying extents from strokes and from other neurological insults, implying that there is considerable cortical plasticity. A lesioned brain is therefore not just a brain with some pieces missing: it is a brain with some pieces missing and with the remaining pieces reorganized to some unknown extent. The most obvious way to address this limitation is to study patients acutely. This is challenging from a practical standpoint, and also because there are often other factors at play early after a brain injury, such as edema and hypoperfusion, that may complicate the picture in different ways. However, several lesion-symptom mapping studies have successfully demonstrated specific regions associated with various component processes of spoken language understanding in acute stroke patients (Kümmerer et al., 2013; Newhart, Ken, Kleinman, Heidler-Gary, & Hillis, 2007; Newhart et al., 2012; Race, Ochfeld, Leigh, & Hillis, 2012; Rogalsky, Pitz, Hillis, & Hickok, 2008; Tsapkini, Frangakis, & Hillis, 2011) or resective surgery patients in the immediate post-operative period (Wilson et al., 2015).
Lesion-symptom mapping studies over the past decade or so have begun to paint a picture of the large-scale organization of the brain regions involved in spoken language understanding.
The neural substrates of word-level comprehension have been investigated in a number of lesion-symptom mapping studies. In particular, VBM studies in patients with primary progressive aphasia and other neurodegenerative diseases have shown that impaired comprehension of single words is associated with atrophy of left anterior temporal regions (Mesulam, Thompson, Weintraub, & Rogalski, 2015; Mummery et al., 2000; Rogalski et al., 2011; Sapolsky et al., 2010). While some authors have argued that the tip of the temporal lobe is the most critical region (Mesulam et al., 2013), a recent multivariate analysis in 110 patients who had undergone resective surgery showed that damage to a region in the fusiform gyrus approximately 6 cm posterior to the temporal pole is predictive of semantic deficits, not the temporal pole itself (Wilson et al., 2015; see also Mion et al., 2010). In stroke patients, single word comprehension specifically has rarely been investigated. Bates et al. (2003) used a composite measure of comprehension that included word-level and sentence-level components; however, an analysis using only the word-level data (the auditory word recognition subscore from the Western Aphasia Battery) showed that word-level comprehension deficits are similarly associated with damage to the posterior middle temporal gyrus (Wilson and Dronkers, unpublished observations), that is, the same region that was associated with comprehension deficits in general (see also Saygin, Dick, Wilson, Dronkers, & Bates, 2003). Another study in acute stroke patients suggested that infarction or hypoperfusion of both anterior and posterior temporal regions contributes to word-level comprehension deficits (Newhart et al., 2007).
These diverging findings from PPA and stroke remain to be reconciled; one recent proposal is that word-level comprehension deficits after posterior temporal damage in stroke are caused by lesion extension into the underlying white matter, which disconnects anterior temporal regions from other perisylvian language regions (Mesulam et al., 2015).
Sentence-level comprehension has been investigated in a number of well-designed and relatively well-powered lesion-symptom mapping studies. The most commonly implicated regions have been left inferior frontal cortex, left superior temporal cortex, and left inferior parietal cortex, with many studies reporting one or more of these regions to be implicated in sentence-level comprehension (Amici et al., 2007; Dronkers, Wilkins, Van Valin, Redfern, & Jaeger, 2004; Fridriksson, Fillmore, Guo, & Rorden, 2015; Leff et al., 2009; Newhart et al., 2012; Rogalski et al., 2011; Teichmann et al., 2015; Thothathiri, Kimberg, & Schwartz, 2012; Wilson et al., 2011; Wilson, Galantucci, Tartaglia, & Gorno-Tempini, 2012). Some studies have suggested that specific regions within this network have specific functions, such as a role for inferior frontal cortex in processing syntactically complex sentences (Amici et al., 2007), and a role for the posterior superior temporal gyrus in auditory short-term memory in support of sentence comprehension (Leff et al., 2009; Wilson, Galantucci, Tartaglia, & Gorno-Tempini, 2012). Damage to dorsal white matter fiber pathways connecting frontal and posterior language regions has also been shown to impair syntactic processing above and beyond damage to gray matter (Wilson et al., 2011). Not all studies have supported this general picture: some authors have argued that damage to anterior temporal regions (Dronkers et al., 2004; Magnusdottir et al., 2013) or ventral pathways (Griffiths et al., 2013) can result in syntactic processing deficits, and other authors have not observed any regions to be systematically associated with syntactic deficits (Caplan et al., 1996, 2007; Caplan, Michaud, Hufford, & Makris, 2015). For further discussion, see Wilson et al. (2012, 2014).
In contrast to the rich findings on word-level and sentence-level comprehension, lesion-symptom mapping studies have made only modest contributions to our understanding of prelexical stages of spoken language understanding. This may be because much of this processing is bilaterally redundant, so genuine prelexical deficits are rare (Poeppel, 2001). Probably the most notable finding bearing on prelexical spoken language processing comes from a recent study of 99 stroke patients, which showed that a derived factor reflecting measures including auditory lexical decision and phoneme discrimination was associated with damage to the left planum temporale and the dorsal part of the superior temporal gyrus (Mirman et al., 2015). The neural correlates of this speech perception factor were clearly distinct from those of a speech production factor, which was affected by damage to immediately adjacent regions dorsal to the Sylvian fissure.
Many other aspects of spoken language understanding have been investigated using lesion-symptom mapping. Some examples include the relationship between regions involved in comprehending words and environmental sounds (Saygin et al., 2003), grammaticality judgment (Wilson and Saygin, 2004), narrative and discourse comprehension (Ash et al., 2012; Barbey, Colom, & Grafman, 2014), lexical and semantic access (Harvey and Schnur, 2015), and paralinguistic features such as voice identity, accent, and emotional prosody (Hailstone et al., 2011, 2012; Rankin et al., 2009; Rohrer, Sauter, Scott, Rossor, & Warren, 2012).
One of the most exciting new directions in lesion-symptom mapping of spoken language understanding is Price and colleagues' “Predicting Language Outcome and Recovery After Stroke (PLORAS)” study (Price, Seghier, & Leff, 2010). This is a large multi-site study in the United Kingdom which had already recruited 750 patients as of early 2015 (Seghier et al., 2016). Structural scans and behavioral data are acquired from all patients, and functional imaging data are also acquired from some patients. The data will be made available for others in the research community to analyze. What makes this study groundbreaking is the size of the patient cohort, which is an order of magnitude larger than most of the large studies to date. Such a substantial patient cohort will ensure that there are enough patients with similar yet distinct lesions, so that subtle effects of lesion size and distribution on different aspects of language processing can be quantified. Multivariate analyses in which behavior is predicted based on multiple regions of interest will become feasible when the sample size is large enough.
A second promising direction is the application of machine learning techniques and other multivariate approaches to lesion-symptom mapping. Machine learning algorithms such as support vector machines have been used to uncover relationships between distributed lesions and clinical syndromes including stroke (Saur et al., 2010), primary progressive aphasia (Wilson et al., 2009), and other neurodegenerative diseases (Klöppel et al., 2008). Recent studies have investigated relationships between lesions and specific symptoms using similar approaches (Xing et al., 2016; Yourganov, Smith, Fridriksson, & Rorden, 2015; Zhang, Kimberg, Coslett, Schwartz, & Wang, 2014). On the behavioral side, constellations of deficits have also been analyzed from a multivariate perspective using principal components analysis (Butler, Lambon Ralph, & Woollams, 2014; Mirman et al., 2015). These new approaches should allow researchers to overcome some of the limitations of the mass univariate approach that have been identified (Inoue, Madhyastha, Rudrauf, Mehta, & Grabowski, 2014; Mah, Husain, Rees, & Nachev, 2014).
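On the behavioral side, the principal components step can be sketched in a few lines of NumPy. The sketch assumes a complete patients-by-measures matrix with no missing scores and performs an unrotated PCA on z-scored measures; the function name is hypothetical, and the cited studies may have used rotated solutions or other variants.

```python
import numpy as np

def behavioral_pca(measures, n_components=2):
    """Unrotated PCA of z-scored behavioral measures via SVD.

    measures: (n_patients, n_measures) array, one column per test.
    Returns (factor_scores, loadings, explained_variance_ratio), where
    factor_scores has shape (n_patients, n_components) and loadings has
    shape (n_components, n_measures).
    """
    X = np.asarray(measures, dtype=float)
    # Standardize so that tests on different scales contribute equally.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = S ** 2 / (S ** 2).sum()
    k = n_components
    return U[:, :k] * S[:k], Vt[:k], explained[:k]
```

The resulting factor scores can then serve as the behavioral variables in a lesion-symptom mapping analysis in place of individual test scores.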
Finally, lesion-symptom mapping needs to be used in conjunction with other cognitive neuroscience techniques such as task-based and connectivity-based functional MRI, perfusion imaging, and diffusion tensor imaging in order to probe the functionality of surviving tissue and its potential reorganization (Saur and Hartwigsen, 2012; Specht et al., 2009; Warren, Crinion, Lambon Ralph, & Wise, 2009; Wilson et al., 2014). Combining structural and functional neuroimaging modalities will provide a more complete picture of which brain regions are critical for different aspects of spoken language understanding, as well as the potential of other regions to carry out these functions when the preferred regions are damaged.
Funding: This work was supported by the National Institute on Deafness and Other Communication Disorders (NIH R01 DC013270).
The program most commonly used to manually delineate lesions is mricron (Rorden & Brett, 2000; http://people.cas.sc.edu/rorden/mricron/index.html). Another excellent program that can be used for lesion delineation is ITK-SNAP (Yushkevich et al., 2006; http://www.itksnap.org).
Intersubject normalization is most often performed with SPM (Friston, Ashburner, Kiebel, Nichols, & Penny, 2007; http://www.fil.ion.ucl.ac.uk/spm) using either the unified segmentation (Ashburner & Friston, 2005) or DARTEL (Ashburner, 2007) algorithms. Another good choice is ANTS (Avants, Epstein, Grossman, & Gee, 2008; http://stnava.github.io/ANTs).
VLSM and VBM can be carried out with the author's MATLAB toolbox vlsm (Bates et al., 2003; http://langneurosci.mc.vanderbilt.edu/resources.html), with NPM (Rorden, Karnath, & Bonilha, 2007; included with mricron), or with any mainstream neuroimaging analysis package, such as SPM (Friston et al., 2007; http://www.fil.ion.ucl.ac.uk/spm). The Statistical NonParametric Mapping (SnPM) toolbox (Nichols & Holmes, 2002; http://warwick.ac.uk/snpm) is recommended for use with SPM.