Lexical access in language production, and particularly pathologies of lexical access, are often investigated by examining errors in picture naming and word repetition. In this article, we test a computational approach to lexical access, the two-step interactive model, by examining whether the model can quantitatively predict the repetition-error patterns of 65 aphasic subjects from their naming errors. The model’s characterizations of the subjects’ naming errors were taken from the companion paper to this one (Schwartz, Dell, N. Martin, Gahl, & Sobel, 2006), and their repetition was predicted from the model on the assumption that naming involves two error-prone steps, word and phonological retrieval, whereas repetition creates errors only in the second of these steps. A version of the model in which lexical-semantic and lexical-phonological connections could be independently lesioned was generally successful in predicting repetition for the aphasic subjects. An analysis of the few cases in which model predictions were inaccurate revealed the role of input phonology in the repetition task.
Among the fundamental assumptions of cognitive neuropsychology are that complex processes consist of components, that the same components participate in different tasks, and that brain damage may affect the components to different degrees (e.g. Caramazza, 1984; Marin, Saffran, & Schwartz, 1976; Rapp, 2001; Shallice, 1988). Here, we employ these assumptions in an investigation of lexical access in language production. We examine the components of lexical access by relating them to aphasic individuals’ performance on two tasks, picture naming and word repetition.
This article is a companion to Schwartz, Dell, N. Martin, Gahl, and Sobel (2006), which described a study of lexical access deficits in 94 aphasic individuals. That study combined the case-series method, which aims to explain patient variation on a task, with computational modeling. Each patient’s performance on picture naming was related to a model of lexical access, the interactive two-step model (Dell, Schwartz, N. Martin, Saffran & Gagnon, 1997). Two contrasting versions of this model were compared in their ability to characterize the naming error patterns of each patient. The more successful version, the semantic-phonological model (Foygel & Dell, 2000), was then further evaluated by testing predictions regarding other properties of the patients’ naming errors.
The goal of the present paper is to apply the model to a different task, word repetition. Can the model parameters determined from the naming data predict the pattern of errors in repetition? To make such predictions, one must propose a theory of the relation between the naming and repetition tasks. We articulate and test such a theory here. In so doing, we also test basic assumptions about lexical access in production, most importantly, the assumption that it includes a component in which meaning is mapped onto a holistic lexical item (word retrieval) and one in which the item’s phonological form is retrieved and processed (phonological retrieval).
The distinction between word and phonological retrieval is, in one form or other, part of many theories of lexical access (e.g. Caramazza, 1997; Cutting & Ferreira, 1999; Dell et al., 1997; Dell, 1986; Garrett, 1975; 1980; Griffin & Bock, 1998; Harley, 1993; Kempen & Huijbers, 1983; Levelt, Roelofs, & Meyer, 1999; Levelt, 1989; Laine, Tikkala, & Juola, 1998; N. Martin, Dell, Saffran & Schwartz, 1994; Rapp & Goldrick, 2000; Roelofs, 1992; but see Lambon Ralph, Sage & Roberts, 2000). This assumption is instantiated in a particular way in the interactive two-step model of production. Word and phonological retrieval are distinct, serially ordered steps, but the steps are not separate modules. Because retrieval in the model is achieved through interactive or bi-directional spreading of activation through a lexical network, information relevant for word access (e.g. semantic features) can influence phonological retrieval, and phonological information can affect word retrieval. In this paper, we are primarily concerned with the distinction between the steps and the relation of these steps and their attendant lexical connections to the tasks of word naming and word repetition. We propose that naming a word from a picture, or from other cues to meaning such as a definition, uses both steps. Repetition, that is, hearing a word and then repeating it, uses only the second step. Consequently, if we can characterize a patient’s deficit based on naming in a precise manner, we ought to be able to predict something about his or her repetition. To the extent that predictions are successful, the assumptions regarding the characterization of the patient and the relations between the tasks are supported.
In the remainder of the introduction, we first review the literature on repetition, particularly the data and theory that have come from analyzing repetition deficits. Then we describe the interactive two-step model and its different versions, and the model’s potential accounts of repetition.
Repeating spoken input, whether it consists of a single word, a word list, or a sentence, is a relatively easy task for unimpaired speakers. There are limits to this ease, however, when the amount of information to be repeated stresses short-term memory capacity (e.g. long sentences or word lists, or complex nonwords). Thus, many studies of repetition in both aphasic and unimpaired speakers have been undertaken as studies of memory, rather than studies of language processing or production (e.g. Hulme, Maughan & Brown, 1991; Warrington & Shallice, 1969). Much of this work has been inspired by the working memory model of Baddeley and Hitch (1974), in which the immediate reproduction of verbal stimuli is assumed to involve the decoding, buffering, and rehearsal of phonological strings.
Research on list repetition from a memory perspective has led to two important conclusions: (1) Immediate repetition of sub-span material is heavily dependent on phonological representations of some sort as proposed by Baddeley and Hitch, but (2) repetition performance bears traces of influence from multiple levels of language processing, including lexical and semantic influences (see Monsell, 1985; Saffran, 1990 for reviews).
Much evidence supports the primacy of phonological representations in list repetition. Phonological similarity (e.g. Conrad & Hull, 1964), word length (Baddeley, Thomson & Buchanan, 1975), and phonotactic probability (Gathercole, Frankish, Pickering, & Peaker, 1999) all affect verbal span, indicating the involvement of representations that correlate with sound structure. Repetition, however, is not just a matter of phonology or other sound-based representations. The role of lexical representations in verbal span is evident from better performance for words over nonwords (Hulme et al., 1991) and word frequency effects (e.g. Hulme, Roodenrys, Schweickert, Brown, A. Martin, & Stuart, 1997). Semantic factors such as the semantic similarity of list members (Shulman, 1971; Crowder, 1979; Poirier & Saint-Aubin, 1995) and category membership (Brooks & Watkins, 1990) also influence span to some extent. These extra-phonological influences have led to the conclusion that the immediate repetition of verbal lists is mediated by multiple linguistic levels (e.g. Berndt & Mitchum 1990; Gupta, 1996; Jefferies, Frankish, & Lambon Ralph, 2006; Patterson, Graham & Hodges, 1994; R. Martin, Shelton & Yaffee, 1994; Saffran, 1990; Saffran & N. Martin, 1990).
Studies in pathological populations support this conclusion. In acquired aphasia, where span impairments are most apparently associated with deficits in phonological input and output processing (e.g. N. Martin & Saffran, 2002; N. Martin & Ayala, 2004) and phonological storage (e.g. R. Martin, Shelton, & Yaffee, 1994), semantic influences are nevertheless evident in the form of imageability effects and occasional semantic errors (N. Martin & Saffran, 1990; 1997; N. Martin, Saffran, & Dell, 1996; Trojano, Stanzione, & Grossi, 1992). In semantic dementia, a form of progressive aphasia that affects the semantic representation of words and objects, word spans are reduced for semantically degraded words, compared to known words (Patterson et al., 1994; Jefferies, Jones, Bateman & Lambon Ralph, 2004; 2005; Knott, Patterson & Hodges, 1997). Moreover, serial recall errors occur in the form of phoneme migration within and across words (Knott, Patterson, & Hodges, 1997; Patterson et al., 1994; Saffran, Coslett, N. Martin, & Boronat, 2003), something that occurs in normal repetition only when the lists contain or comprise nonwords (Jefferies et al., 2006; Treiman & Danis, 1988). This has given rise to the idea that semantic knowledge provides essential support to the coherence of phonological representations in short-term memory (the semantic binding hypothesis; Patterson et al., 1994; Jefferies et al., 2006). On a more general note, the evidence from semantic dementia provides strong confirmation that multiple linguistic levels are involved in multi-word retention and repetition.
Can this same multiple linguistic-levels model be valid in the repetition of a single word? Because unimpaired speakers (and those with semantic dementia) repeat single words so accurately, little can be learned from their repetition errors. For some individuals with aphasia, though, single-word repetition can be quite difficult. Breakdown can, in principle, occur during input processing (speech or word recognition), memory retention, or output processing (production). Classic taxonomies of aphasia (Geschwind, 1965; Goodglass & Kaplan, 1983; Benson & Ardila, 1996) distinguish aphasic syndromes based on impairment or preservation of repetition, and reflect these three potential points of disruption. Patients whose lesions affect perisylvian areas exhibit repetition difficulties that could be related to impaired input processing (Wernicke’s aphasia), impaired transmission of information to output systems (conduction aphasia), or impaired output systems (Broca’s aphasia). The transmission deficit of conduction aphasia has been attributed to impaired phonological encoding during output (Kohn, 1984) and to a selective impairment of auditory-verbal short-term memory (Warrington & Shallice, 1969). Other aphasic syndromes are notable for relatively preserved single-word repetition despite other production or comprehension difficulties. In Benson and Ardila’s (1996) classification scheme, these syndromes arise from extrasylvian lesions and include transcortical sensory aphasia (impaired comprehension and word retrieval), transcortical motor aphasia (difficulty initiating speech), and anomic aphasia (word retrieval deficits).
Our approach to aphasic repetition differs from those based on classic aphasic syndromes. In the first place, we emphasize the specific contribution of output impairments to repetition errors for all aphasic subjects regardless of how they are categorized, although we recognize that input processing difficulty matters in some cases. More importantly though, our approach does not explain patient variation in repetition by appealing to the clinical syndromes, but rather by determining the extent to which particular linguistic levels (phonological, lexical, semantic) and their attendant connections are damaged. In other words, our model explains aphasic repetition through a particular instantiation of the multiple linguistic-levels model.
The multiple-levels approach to aphasic single-word repetition is supported by a variety of studies. As was suggested by research from the memory perspective, phonological representations are critical. When phonological representations are spared in language disorders such as transcortical aphasia (N. Martin & Saffran, 1990; Berthier, 1999) or semantic dementia (Whitaker, 1976; Schwartz, Marin & Saffran, 1979), repeating single words is typically not difficult; when this level is impaired, repetition is often poor (Kohn, 1984; Caplan, Vanier & Baker, 1986). Other linguistic levels matter as well, though. Concrete (or highly imageable) words and high-frequency words are more accurately repeated than their abstract or uncommon counterparts, demonstrating lexical and semantic influences (e.g. N. Martin & Saffran, 1997; Hanley & Kay, 1997; Hanley, Kay & Edwards, 2002). Moreover, if there is evidence of semantic-level damage as well as phonological impairment, as in the syndrome known as deep dysphasia (e.g. Howard & Franklin, 1988), semantic as well as phonological errors can occur in repetition of a single word.
If, as hypothesized, single-word repetition depends on phonological representations that are influenced by processing at other levels, one needs a general theory of language processing to characterize the task, a theory that specifies the nature of phonological input and output representations, and which allows for participation from the lexical and semantic levels. The interactive two-step model of word production provides some of what is required. Its phonological retrieval step is responsible for building a phonological representation that is used for output, and the model’s spreading activation assumptions allow the lexical and even the semantic levels to affect phonological retrieval. Because the model deals with production, and specifically the naming task, though, it requires additional assumptions about input processing to handle repetition. In the remainder of this section, we describe how the interactive two-step model produces words from meaning, and review three applications of the model to single-word repetition (Dell, et al., 1997; Hanley et al., 2004; N. Martin et al., 1994). These applications all assume that word repetition uses the phonological retrieval step of the model, but differ in their assumptions about phonological input processing.
Because the model was developed to explain picture naming, it maps from word meaning, represented as a set of semantic features, to the word’s phonological form. The knowledge required for this mapping is contained in a hierarchical network as illustrated in Figure 1. Network layers include semantic features, words, and phonemes, with adjacent layers linked by bi-directional excitatory connections. Word retrieval, the first step, begins with the activation of the target word’s semantic features. Let us assume that the target word is CAT, and so the activation of each semantic-feature unit for CAT would be set to a certain amount (see Dell et al., 1997; Foygel & Dell, 2000; Ruml & Caramazza, 2000, for details; and Dell et al., 2004, and Schwartz et al., 2006, for recent changes in the implementation). The model assumes that the initial activation of the semantic-feature units proceeds normally in aphasia. That is, aphasic subjects can correctly identify a picture of a cat and retrieve a representation of the word CAT’s meaning. (This assumption is clearly not always true; see Rapp & Goldrick, 2000; Schwartz et al., 2006.) Activation then spreads throughout the network, both from semantics downward to words and phonemes, and from phonemes upward to words and semantics. The spread depends on the weight or strength of the connections and the rate of decay of activation. In addition, during each time step, activation levels are randomly perturbed. After a fixed period of time, the most activated word of the proper syntactic category is selected. In a task in which pictured objects are named, nouns are selected. The selection of the most activated noun completes the word retrieval step.
Phonological retrieval starts with the selected word unit being given an extra jolt of activation. Activation spreads again throughout the network, again in both directions, upward to semantics and downward to phonology. After this, the most active phoneme units are selected, completing the word-form retrieval process. Errors in naming can occur during either word or phonological retrieval, and result when a nontarget word or phoneme has a higher activation than that of the correct item and is thus selected instead. Spreading activation naturally leads to the activation of units that represent semantically or formally similar words. This activation, when combined with random noise, leads to error. Errors in word retrieval are necessarily lexical and include semantic errors (DOG), formal errors (MAT), mixed errors (RAT), or unrelated word errors (LOG). Nonword errors (CAG) occur during phonological retrieval. Formal and mixed errors can also occur in this step.
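The two retrieval steps can be sketched in miniature. Everything in the sketch below is an illustrative assumption rather than the model’s actual implementation: the toy lexicon, the feature-overlap activation rule, the single rival phoneme per target phoneme, and all parameter values.

```python
import random

# Toy lexicon: each word has semantic features and phonemes (illustrative).
LEXICON = {
    "cat": {"features": {"feline", "pet", "animal"}, "phonemes": ("k", "a", "t")},
    "dog": {"features": {"canine", "pet", "animal"}, "phonemes": ("d", "o", "g")},
    "mat": {"features": {"floor", "object"},         "phonemes": ("m", "a", "t")},
    "rat": {"features": {"rodent", "pest", "animal"}, "phonemes": ("r", "a", "t")},
    "log": {"features": {"wood", "object"},          "phonemes": ("l", "o", "g")},
}

def word_activation(target, word, s_weight, noise):
    """Activation a word unit receives from the target's semantic features
    (shared features times the semantic weight) plus Gaussian noise."""
    shared = len(LEXICON[target]["features"] & LEXICON[word]["features"])
    return s_weight * shared + random.gauss(0.0, noise)

def retrieve_word(target, s_weight=1.0, noise=0.0):
    """Step 1 (word retrieval): select the most activated word."""
    return max(LEXICON, key=lambda w: word_activation(target, w, s_weight, noise))

def retrieve_phonemes(word, p_weight=1.0, noise=0.0):
    """Step 2 (phonological retrieval): each target phoneme competes with
    one similar rival; the more active of the two is selected."""
    similar = {"k": "g", "a": "o", "t": "d", "d": "t", "o": "a", "g": "k",
               "m": "n", "r": "l", "l": "r", "n": "m"}
    out = []
    for ph in LEXICON[word]["phonemes"]:
        a_target = p_weight + random.gauss(0.0, noise)
        a_rival = random.gauss(0.0, noise)
        out.append(ph if a_target >= a_rival else similar[ph])
    return tuple(out)

def name_picture(target, s_weight=1.0, p_weight=1.0, noise=0.0):
    """Naming = word retrieval followed by phonological retrieval."""
    word = retrieve_word(target, s_weight, noise)
    return retrieve_phonemes(word, p_weight, noise)
```

With intact weights and no noise, the toy model names correctly; weakening the semantic weight and adding noise produces selection errors of the kinds described above (e.g. DOG at step 1, or a wrong phoneme at step 2).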
Applying the model to aphasic naming requires a theory of the nature of the damage. Dell et al. (1997) associated damage with a global decrement in the model’s connection weights (which were all the same in the actual implementation) or a global increase in the rate with which activation decays. This is the weight-decay version of the model. Decreasing the weights, for example, from an assumed normal value of .1 to .01, or increasing the decay rate from .5 per time step to, say, .9 causes activation levels throughout the network to become small. Thus the “signal” becomes lost in the random noise and errors become more likely. Changing the weight and changing the decay, though, have different effects on the error pattern, with weight lesions tending to promote more nonword and unrelated errors, and decay lesions creating mostly semantic, mixed, and formal errors. (See Foygel & Dell, 2000, for an analysis of why this happens). To apply the model to the naming errors of aphasic subjects, Dell et al. (1997) assigned each aphasic subject in their study a value on the weight parameter and a value on the decay parameter, effectively diagnosing each subject as having either a weight lesion, decay lesion, or both. The parameter assignment or “fitting” process involved choosing weight and decay values that made the model’s error pattern as close as possible to that of each patient. While Dell et al. reported that the model was able to fit most of the patients, lively debate has ensued on such questions as how best to measure and evaluate the model’s fit to patient data and what constitutes support for the model’s assumptions regarding interactivity (Dell et al., 2000; Foygel & Dell, 2000; Rapp & Goldrick, 2000; Ruml & Caramazza, 2000; Ruml et al., 2000; Ruml et al., 2005). Rapp and Goldrick (2000), for example, have presented an alternative model in which the upward flow of activation is considerably limited, and Ruml et al. (2000, 2005) have questioned whether the extent to which the model fits the data is strong enough to support claims for interactivity. Readers are referred to these papers, and to Schwartz et al. (2006), for extended discussion and additional data. Here, we focus on another debated issue, namely, the adequacy of Dell et al.’s theory of the nature of aphasic deficits.
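The qualitative effect of both lesion types, shrinking activation until the signal is lost in noise, can be illustrated with a simplified linear activation update. The actual model is nonlinear and noisy; the update rule and step count below are illustrative only, though the parameter values (.1 vs. .01 weight, .5 vs. .9 decay) are the ones cited above.

```python
def settled_activation(weight, decay, input_strength=1.0, steps=8):
    """Iterate a simplified linear update: each time step, activation
    decays by `decay` and receives weighted input. Returns the final
    activation level (the 'signal' that must beat random noise)."""
    a = 0.0
    for _ in range(steps):
        a = a * (1.0 - decay) + weight * input_strength
    return a

normal   = settled_activation(weight=0.1,  decay=0.5)  # intact parameters
w_lesion = settled_activation(weight=0.01, decay=0.5)  # weight lesion
d_lesion = settled_activation(weight=0.1,  decay=0.9)  # decay lesion
```

Both lesions leave activation well below the intact level (which settles near weight/decay = .2 here), so with a fixed noise level, errors become more likely in either case.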
Foygel and Dell (2000) proposed an alternative way to lesion the model in which the global weight parameter is divided into a semantic weight (s) and a phonological weight (p). That is, lesions can independently affect the lexical-semantic and the lexical-phonological connections. This is the semantic-phonological version of the model. Like the weight-decay model, it has two lesionable parameters. (Although activation decays in the semantic-phonological model, the decay rate is not considered to be lesionable). Lesioning the semantic weight tends to promote lexical errors, while phonological lesions create mostly nonword errors.
The weight-decay and semantic-phonological models’ ability to account for naming error patterns has been compared in several studies. Some studies failed to show a clear advantage for one model version over the other (Foygel & Dell, 2000; Ruml et al., 2000). The companion paper to this study, Schwartz et al. (2006), is the largest model-comparison study of aphasic naming to date; it found that the semantic-phonological model enjoys a small, but clear, advantage in accounting for the naming errors made by 94 aphasic subjects. In a re-analysis of published data, Schwartz et al. found that all prior studies tended to favor that model as well. The definitive test, however, should come from applying the models to single-word repetition. As hypothesized, repetition is primarily a phonological task, unlike naming, which involves word retrieval from meaning. Given this, the two versions of the model make different predictions regarding the relation between naming and repetition. In the weight-decay model, all deficits are global. Consequently, a naming pattern featuring many errors of any sort implies a large global deficit that includes the phonological level, and thus one expects to see poor repetition. In the semantic-phonological model, poor naming predicts poor repetition only if the poor naming implicates a lesion to the phonological weights. The present study uses the weight-decay and semantic-phonological model parameters derived from Schwartz et al. to predict the repetition errors of patients from that study, thereby providing the critical test of the competing model versions. Before we describe our study, though, we need to discuss the specifics of how a naming model can be applied to repetition.
Using an early implementation of the weight-decay version of the model, N. Martin et al. (1994) sought to explain the naming and repetition errors of NC, an aphasic individual who exhibited the unusual deep dysphasic pattern, which includes semantic errors in repetition. NC’s naming errors were well characterized by globally reducing the decay rate. To apply the model to repetition, Martin et al. assumed that the phonological units and lexical-phonological connections in the naming model were used for both input and output processing. Repetition was simulated by a two-step process of, first, word recognition and then word production. In the first step, the phoneme units were activated, this activation spread throughout the network, and the most activated word was selected. The second step was just the phonological retrieval step that is also involved in naming, the only difference being that the word unit given the jolt of activation was the recognized word from the first step. The model was able to account for both the overall level of NC’s repetition as well as his error pattern. Semantic errors occurred during the word recognition step and were especially promoted by the decay lesion.
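Under the shared input/output-phonology assumption of N. Martin et al. (1994), repetition can be sketched as word recognition followed by the same production step used in naming. The toy lexicon, the overlap-based recognition rule, and the single stand-in rival phoneme below are illustrative assumptions, not the actual implementation.

```python
import random

# Toy lexicon: word -> phonemes, shared between input and output (illustrative).
WORDS = {"cat": ("k", "a", "t"), "cap": ("k", "a", "p"),
         "dog": ("d", "o", "g"), "mat": ("m", "a", "t")}

def recognize(input_phonemes, noise=0.0):
    """Step 1 (word recognition): words are activated in proportion to
    their phoneme overlap with the input; the most activated wins."""
    def act(word):
        overlap = sum(p == q for p, q in zip(WORDS[word], input_phonemes))
        return overlap + random.gauss(0.0, noise)
    return max(WORDS, key=act)

def produce(word, p_weight=1.0, noise=0.0):
    """Step 2 (phonological retrieval): the production step shared with
    naming; a weak p_weight or high noise yields phoneme errors."""
    out = []
    for ph in WORDS[word]:
        rival = "x"  # stand-in competitor phoneme
        ok = p_weight + random.gauss(0.0, noise) >= random.gauss(0.0, noise)
        out.append(ph if ok else rival)
    return tuple(out)

def repeat(input_phonemes, p_weight=1.0, noise=0.0):
    """Repetition = recognition, then the jolted word is produced."""
    return produce(recognize(input_phonemes, noise), p_weight, noise)
```

Because step 1 is itself noisy and lexically mediated in this account, a severe lesion can misselect the word during recognition, which is how the model produced NC’s semantic errors in repetition.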
The assumption of phonological representations that are shared between input and output was successful in accounting for NC’s repetition. More generally, though, neuropsychologists believe that phonological input and output are sometimes dissociated and, particularly, that it is common to find cases in which phonological output is disrupted but input processing is not (see R. Martin, 2003, for review). To account for such cases with a model such as that shown in Figure 1, one can assume separate input and output units at the phonological level, each with separate connections to the word level. Dell et al. (1997) applied the separate input/output approach to naming and repetition by making the perfect recognition assumption: Aphasic subjects repeat single words by correctly recognizing the word, and then producing the word using the phonological retrieval step of production. Like the approach of N. Martin et al. (1994), repetition entails word recognition followed by a word-production step that corresponds to the phonological retrieval step. The difference is that the word recognition step is assumed to be error-free. For perfect recognition to occur in the face of impaired phonological output, the model must assume that input processing to at least the word level involves separate units and connections from those dedicated to output (e.g., Caramazza, 1988).
Once the perfect recognition assumption is made, it is trivial to predict repetition from naming. The model’s parameters (either global weight and decay, or s and p, depending on model version) are set based on the naming error pattern. Then the model is simply run through only the phonological retrieval step. There is a jolt of activation to the correct word unit and this activation reverberates among phonological, lexical, and semantic units, and eventually the most active phoneme units are chosen to represent the response. The selected phonemes are coded in relation to the target (correct, semantic, nonword, etc.) and the resulting response proportions are the model’s prediction of what the patient’s repetition pattern should be. Dell et al. (1997) applied the weight-decay model and the perfect recognition assumption to the repetition of eleven patients and found that the model gave a good account of the repetition of nine of them. The account was somewhat improved when Foygel and Dell (2000) replaced the weight-decay version with the semantic-phonological one. Of course, the perfect recognition assumption has claims to validity only in patients who demonstrate good auditory input processing. This important caveat will be fleshed out in later sections.
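The prediction procedure under perfect recognition can be sketched as follows. For illustration only, the toy phonological step collapses the response categories to correct vs. nonword (in the real model, feedback also yields formal and other lexical errors), and the trial counts, noise level, and parameter values are assumptions.

```python
import random
from collections import Counter

def phonological_step(p_weight, noise=0.3):
    """Run only phonological retrieval for the (correctly recognized)
    target: three phonemes, each competing with a noisy rival. Any
    wrong phoneme is coded 'nonword' in this simplified sketch."""
    correct = all(p_weight + random.gauss(0.0, noise) >= random.gauss(0.0, noise)
                  for _ in range(3))
    return "correct" if correct else "nonword"

def predict_repetition(p_weight, trials=2000, seed=0):
    """Predicted repetition pattern: response-category proportions from
    many runs of the phonological step with the naming-fitted weight."""
    random.seed(seed)
    counts = Counter(phonological_step(p_weight) for _ in range(trials))
    return {cat: n / trials for cat, n in counts.items()}
```

A patient fitted with an intact phonological weight is predicted to repeat well regardless of naming accuracy, whereas a fitted phonological lesion predicts many nonword errors in repetition, which is the contrast that separates the two model versions.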
The approaches to repetition of N. Martin et al. (1994) and Dell et al. (1997) are single-route lexicalist models. Words are repeated by recognizing them as words and then producing them. Clearly such an account cannot explain people’s ability to repeat nonwords without additional assumptions. One such assumption is that people can map directly from input to output phonology through a nonlexical route. Does this hypothesized nonlexical route play a role in word repetition? According to some (e.g. Hanley, Kay, & Edwards, 2002; Hillis & Caramazza, 1991), words are repeated by summing activation from both a lexical and a nonlexical route. This dual-route mechanism was grafted onto the interactive two-step model by Hanley et al. (2004). The semantic-phonological model with the perfect recognition assumption was augmented with a nonlexical source of activation feeding to the target output phoneme units. Thus the phonemes received activation from both the recognized word and the nonlexical route. The strength of the nonlexical route was estimated by examining patients’ ability to repeat nonwords. Hanley et al. then showed that the repetition of two patients, which was unexpectedly good given their poor naming performance and was underpredicted by a single lexical route, was adequately explained by the model with two routes.
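The dual-route summation can be sketched as below. The linear summation, the route strengths, and the noise level are illustrative assumptions; the key point is that words draw on both routes while nonwords draw only on the nonlexical one.

```python
import random

def repeat_phoneme_correct(lexical, nonlexical, noise=0.4):
    """A target phoneme is produced correctly when its summed activation
    (lexical route + nonlexical route) beats a noisy rival's."""
    target = lexical + nonlexical + random.gauss(0.0, noise)
    rival = random.gauss(0.0, noise)
    return target >= rival

def accuracy(lexical, nonlexical, trials=5000, seed=0):
    """Estimated per-phoneme repetition accuracy for given route strengths."""
    random.seed(seed)
    return sum(repeat_phoneme_correct(lexical, nonlexical)
               for _ in range(trials)) / trials

# A patient with a weak lexical route: words benefit from both routes,
# nonwords get only the nonlexical route, and a single lexical route
# alone underpredicts word repetition (the Hanley et al. pattern).
word_acc     = accuracy(lexical=0.2, nonlexical=0.4)
nonword_acc  = accuracy(lexical=0.0, nonlexical=0.4)
single_route = accuracy(lexical=0.2, nonlexical=0.0)
```

In this sketch the nonlexical strength plays the role of the parameter Hanley et al. estimated from nonword repetition; adding it raises predicted word repetition above what the lexical route alone would yield.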
For the present study, 65 of the patients from the Schwartz et al. (2006) naming study were tested on a word-repetition version of the naming test (the Philadelphia Repetition Test), and some ancillary tests of input processing. Parameters derived from the naming study are used to predict repetition, assuming perfect recognition and a single lexical route. The weight-decay and semantic-phonological approaches to lesioning are compared. Although we directly simulate only the single-route-with-perfect-recognition approach to repetition with all of the patients, the results will bear on the shared input/output approach, as we explain later. Moreover, we will simulate the dual-route approach with a subset of the sample, for which we have data on nonword repetition.
Sixty-five of the 94 individuals who took part in the computational study of naming (Schwartz et al., 2006) were also tested on repetition and ancillary processes. These 65 comprise the study participants. All had adequate hearing (with or without amplification aids), as determined informally and, in questionable cases, by pure tone audiometry and a functional hearing protocol designed for the elderly (Weinstein, 1986). Information about the general characteristics of the sample is provided in Schwartz et al., but the 65 patients’ clinical classifications and naming error patterns are reproduced here in Table 1, using the same pseudo-initials to identify them that Schwartz et al. used. As the Table indicates, a variety of classically defined aphasia types are represented in this sample. Additionally, the participants ranged widely in the severity of their aphasia, as will be apparent in their performance on background tests and in repetition. Importantly, patients were not excluded from the original sample or the present subset because of their naming or repetition data (see Schwartz et al. for details of subject selection and naming error patterns). This contrasts with previous studies, which did not include patients who made many naming omissions or who had articulatory deficits (e.g. Dell et al., 1997), or which over-represented patients with particular error patterns (Ruml et al., 2000; 2005). Data were gathered in the years 1997 to 2002, under the same IRB-approved protocol as the naming study.
These 65 subjects were administered the Philadelphia Repetition Test, which tests the ability to repeat single words. The stimuli are the 175 target names for the pictures in the Philadelphia Naming Test, but the stimuli are presented in a different randomized order from that used in the naming test. Target names are all nouns and are 1 to 4 syllables in length. Noun frequency of the target words ranges from 1–2110 occurrences per million in written text (Francis & Kucera, 1982). Most target words are in the low frequency range (1–24 occurrences per million).
Stimuli for the word repetition test were recorded onto audiotape at a rate of 1 per 5 sec. The participant’s task was to repeat each word immediately after hearing it. In rare cases when subjects were still responding towards the end of the interval, the tape was stopped to allow the response to be completed. Requests for repetition of the stimulus were denied.
All responses were tape-recorded. During the test, the examiner, an experienced speech-language pathologist, transcribed the responses. The audiotapes were then transcribed by a research assistant. Any discrepancies between the two transcriptions were resolved jointly by the two transcribers.
The categories used to score repetition responses were identical to those used to score the naming responses in Schwartz et al. (2006). One response was scored per trial; when multiple responses were given, the first complete response was scored.
For most subjects, responses were marked as correct if all sounds were produced accurately. In our 1997 study, we had excluded participants with apparent articulatory-motor impairments because of the difficulty in distinguishing articulatory and phonological errors (McNeil, Robin, & Schmidt, 1997). In this and the companion naming study, we chose to include them but to score their responses with some leniency. Thus, we did not score as errors any minor articulatory distortions that were consistent for that patient, and we scored as correct those responses that deviated by the addition, deletion, or substitution of a single consonant.
Responses that were not scored as correct were assigned to one of the categories described below:
Semantic: Whole word error that is a synonym, close associate, coordinate, subordinate, or superordinate of the target word.
Formal: Whole word error that is phonologically related to the target word. The criterion for formal similarity is that the target and error begin or end with the same phoneme, have another phoneme in common in corresponding syllable or word positions, or share more than one phoneme, other than unstressed vowels, in any part of the word.
Mixed: Whole word error that meets both semantic and formal criteria.
Nonword: Any nonword, including both nonwords that are phonologically similar to the target (e.g. bucket → bucken) and those that are not.
Unrelated: Whole word error that is neither semantically nor formally related to the target word.
We also used the additional minor error categories used by Schwartz et al.—failures to respond, descriptions, and miscellaneous responses. These comprised only 1.5% of repetition responses.
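These scoring rules can be approximated in code. The classifier below is a simplified sketch: responses are phoneme tuples, the semantic-relatedness judgment is supplied externally (as by a human scorer), and the formal criterion omits the syllable-position and unstressed-vowel details.

```python
def formally_related(target, response):
    """Simplified formal-similarity criterion: same first or last
    phoneme, or more than one shared phoneme anywhere. (The full
    criterion also weighs syllable position and excludes unstressed
    vowels.)"""
    if target[0] == response[0] or target[-1] == response[-1]:
        return True
    return len(set(target) & set(response)) > 1

def classify(target, response, lexicon, semantically_related):
    """Assign a response to a scoring category. `lexicon` is the set of
    known word forms; `semantically_related` is an external judgment."""
    if response == target:
        return "correct"
    if response not in lexicon:
        return "nonword"          # any nonword, related or not
    formal = formally_related(target, response)
    if semantically_related and formal:
        return "mixed"
    if semantically_related:
        return "semantic"
    if formal:
        return "formal"
    return "unrelated"
```

For the target CAT /k a t/, this sketch codes DOG as semantic, MAT as formal, RAT as mixed, and a non-lexical string such as /k a g/ as a nonword, mirroring the examples used earlier in the article.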
The perfect-recognition assumption predicts that the naming model should be able to predict repetition performance only for those individuals whose input phonological processing is intact. To ascertain the input processing abilities of our participants, we administered the following tests of the integrity of their auditory-phonological input processing:
This test measured the ability to discriminate the spoken name of a picture from semantically or phonologically related foils. Stimuli were 162 of the 175 pictures from the Philadelphia Naming Test, each of which was paired, on different occasions, with the target name, a close semantic foil, a semantically remote foil, a phonologically close nonword foil, and a phonologically remote nonword foil. The stimuli comprised two lists, both of which each subject experienced in counterbalanced order over the several sessions required to complete testing. In List 1, half of the items were assigned to the semantic subset and the remaining half to the phonological subset. Each picture in the semantic subset appeared three times, once with a matching auditory stimulus, once with a semantically close foil, and once with a semantically remote foil. Each picture in the phonological subset also appeared three times, once with a matching auditory stimulus, once with a phonologically close foil, and once with a phonologically remote foil. List 2 consisted of the complementary set of semantic and phonological items. Thus, there were twice as many nonmatch as match trials. For presentation, the semantic and phonological foils were intermixed within each list in order to prevent subjects from simply treating the items with nonword phonological foils as a lexical decision task.
Pictures were black and white line drawings; auditory stimuli were recorded on tape at an interval of 1 per 8 seconds. On each trial, the experimenter exposed a picture in advance of the paired auditory stimulus and left it exposed until the subject responded yes (match) or no (nonmatch). At that point, the next picture was exposed, in advance of its paired auditory stimulus. Subjects tested early in the study received a computerized version of this task, which we stopped using because of software-hardware incompatibilities. The two versions gave roughly comparable results.
The Philadelphia Name Verification Test yields separate measures of semantic and phonological input processing. Here we present just the measure of phonological input processing, defined as the rate of correct rejection of phonological foils (close and remote). The computerized version of the test was administered to 29 age-matched control participants; they rejected phonological foils at a rate of .987 (SD .013).
This test was adapted from N. Martin and Saffran (1992). Participants listened to two spoken words or nonwords presented on audiotape. The two items were either the same or differed by one or two phonemes. There were 20 word pairs and 20 nonword pairs, 10 pairs with identical items and 10 with items that differed. For different pairs, the phonemes that did not match were sampled equally from initial, medial and final positions. Additionally, the interval between the first and second items in the pairs was varied. In one condition, the items occurred in immediate succession; in the other, they were separated by a 5-second interval, during which the examiner and subject counted together to 5. The same stimulus pairs were presented in both conditions, in two counterbalanced lists. The measure of interest was the number of correct discriminations (match and no match) in each interval condition.
This is a word-nonword discrimination task, containing 80 nonwords and 80 words, the latter equally divided into the following categories: High Imageability/High Frequency, High Imageability/Low Frequency, Low Imageability/High Frequency, Low Imageability/Low Frequency. The stimuli were presented via tape recorder, and the subject's task was to decide whether each item was a word or not a word. The rate of correct acceptances of words and the rate of false acceptances of nonwords were treated as separate measures of input phonological processing (e.g., Allport, 1984; R. Martin, Breedin, & Damien, 1999; N. Martin & Saffran, 2002).
To summarize, there were five distinct measures of input processing: (1) the rate of correct rejections of phonological foils on the name verification test, (2) percentage correct for phoneme discrimination without a filled interval, (3) percentage correct discrimination with a 5-second filled interval, (4) rate of acceptance of words in auditory lexical decision, and (5) rate of correct rejection of nonwords in auditory lexical decision.
Thirty of the participants were administered a second repetition test, designed to assess nonword repetition. The sixty nonwords to be repeated were derived from sixty concrete one- and two-syllable words (mean = 1.37 syllables) that ranged in frequency from less than one per million to 717 per million (Kucera & Francis, 1982). The nonwords were created from the words by altering one consonant and one vowel, with the constraint that the stimuli remained phonologically legal. The words from which the nonwords were generated were included as fillers in the test so that the participants would not assume that the stimuli were solely lexical or solely nonlexical. Performance on the fillers was not analyzed for this study. The stimuli were organized into two lists of 30 words and nonwords each so that a derived nonword and its word counterpart were not on the same list. Lists were presented on separate days. The stimuli were presented auditorily by the examiner. The participant’s task was to repeat the word or nonword immediately after hearing it. Incorrect responses to the nonwords were scored as either word or nonword outcomes.
Table 2 shows the average distribution of correct and error responses on the Philadelphia Repetition Test for the 65 participants, and compares it to the average distribution of these participants’ responses on the Philadelphia Naming Test, based on the data from Schwartz et al. (2006). Only 1.5% of repetition responses did not fall into one of the six categories presented in the table, and 90% of these were failures to respond. As in Schwartz et al., the six response proportions presented here and in all other such tables are normalized; the responses outside of these categories are removed and the proportions are recalculated so that they add to 1.0.
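The normalization just described can be sketched in a few lines (a minimal illustration; the function name and the example tallies are ours, not taken from the study):

```python
def normalize_proportions(counts):
    """Drop responses outside the six scoring categories and rescale
    the remaining proportions so they sum to 1.0."""
    categories = ["correct", "semantic", "formal",
                  "mixed", "unrelated", "nonword"]
    total = sum(counts[c] for c in categories)
    return {c: counts[c] / total for c in categories}

# Hypothetical tallies: 175 in-category responses; any failures to
# respond, descriptions, or miscellaneous responses are simply excluded
raw = {"correct": 120, "semantic": 10, "formal": 15,
       "mixed": 5, "unrelated": 5, "nonword": 20}
props = normalize_proportions(raw)
```

The same rescaling applies to every response-proportion table in the article, so normalized proportions are always directly comparable across tasks.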
The response distributions reflect our expectations. Repetition was, on average, more accurate than naming. If, as we claim, naming errors are generated during word and phonological retrieval, and repetition errors only occur during phonological retrieval, there are more chances for error in naming. Moreover, repetition errors were largely confined to the nonword and formal categories, while semantic, mixed, and unrelated errors were common in naming. Again, this is what would be expected if repetition errors occurred during the phonological retrieval step. Errors in this step should largely consist of sound deviations from the target. If these created words, they would be classified as formal errors; if not, they would fall in the nonword category.
The model’s claim that repetition errors arise during phonological retrieval is only viable to the extent that errors do not occur during input, that is, that the perfect recognition assumption is true. We will confine our tests of the models to those patients whose ancillary test performance did not produce clear evidence of an input-processing deficit. To identify excluded patients, all patients’ scores on each of the five input-processing tests were converted to z-scores, in which a positive score means that performance was better than the patient average. We then identified as input-processing-impaired any patient who obtained a z-score of −1.5 or less on at least two of the five measures. Notice that we are biasing against labeling patients as impaired; the z-scores are relative to the other patients, not normal controls. This approach is conservative with respect to the goal of supporting the model, because it includes patients whose input processing probably falls short of normal. The six individuals presented in Table 3 were considered to have impaired input processing by our criterion. Note also that each of them meets the criterion and has negative scores on at least four of the five measures, so the evidence for impairment is reasonably consistent in these cases.
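The exclusion criterion can be expressed as a short procedure (a sketch in our own naming conventions; the toy scores in the test are hypothetical, not patient data):

```python
from statistics import mean, stdev

def flag_input_impaired(scores_by_test, cutoff=-1.5, min_tests=2):
    """Convert each patient's score on each input-processing test to a
    z-score relative to the patient sample (not normal controls), then
    flag any patient with z <= cutoff on at least min_tests measures."""
    zscores = {}  # patient -> list of z-scores across tests
    for scores in scores_by_test.values():
        m, s = mean(scores.values()), stdev(scores.values())
        for patient, x in scores.items():
            zscores.setdefault(patient, []).append((x - m) / s)
    return {p for p, zs in zscores.items()
            if sum(1 for z in zs if z <= cutoff) >= min_tests}
```

Because the reference distribution is the aphasic sample itself, a patient must perform unusually poorly relative to other patients, not merely below normal, to be flagged.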
Preliminary to predicting individuals’ repetition, we first determined whether the model can predict the mean repetition pattern (Table 2) from the mean naming pattern. This allows us to see whether the overall difference between naming and repetition is consistent with the two versions of the model and with the perfect recognition assumption. For this analysis, we included all 65 subjects. Later, when we apply the model to individuals, we will distinguish those with input-processing impairments from the rest.
The mean naming pattern was treated as if it came from a single subject, and parameters for the weight-decay and semantic-phonological models were derived as in prior work (see Dell et al., 2004; Schwartz et al., 2006). Then, for each model version, repetition was predicted assuming that the recognition of the target word is correct (perfect recognition assumption) and that its production is estimated by the parameterized model’s phonological retrieval step (single-lexical route to repetition). The results of this analysis are presented in Table 4.
The table reveals that both the weight-decay and semantic-phonological model versions can predict the mean repetition pattern fairly well. Crucially, both versions correctly predict the rarity of semantic, mixed, and unrelated errors in repetition, compared to naming. This is because these errors, according to the model, occur to an overwhelming extent during the word retrieval step in naming and this step is absent in repetition. The model versions also correctly predict the greater likelihood of nonword over formal errors in repetition. In the data, this probably occurs because of the relative opportunities for phonological perturbations to create words and nonwords. Most legal phonological strings in English are nonwords rather than words and the model’s lexicon is set up to reflect this fact. Finally, both model versions predict approximately the right level of performance. Although the prediction from the semantic-phonological model is slightly more accurate, at this point the safest conclusion is that both versions of the model are consistent with the mean repetition pattern and its differences from naming. The true test of the models should thus come in their ability to predict individuals, and we turn to this in the next section.
Weight-decay and semantic-phonological model parameters derived from the individual naming patterns were taken from Schwartz et al. (2006) and used to predict repetition, under the perfect recognition and single-lexical route assumptions. The obtained naming and repetition response proportions, model parameters, and the model repetition predictions are shown in Table 1. In addition, for each model, a measure of prediction accuracy, the root mean squared deviation (rmsd) is reported. This is simply the square root of the average of the squared deviations between each predicted proportion and the corresponding obtained proportion. Roughly speaking, an rmsd of .06 means that the average deviation across the six response proportions is .06.
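The rmsd measure described above can be written out directly (a minimal sketch; the example proportions are hypothetical, not taken from Table 1):

```python
import math

def rmsd(predicted, obtained):
    """Root mean squared deviation across corresponding response
    proportions (six categories per patient in this study)."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, obtained))
                     / len(predicted))

# Hypothetical predicted vs. obtained proportions for the six categories
pred = [0.77, 0.01, 0.05, 0.01, 0.01, 0.15]
obt  = [0.79, 0.00, 0.06, 0.01, 0.02, 0.12]
```

An rmsd of 0 would indicate a perfect prediction; values near .05 correspond to an average per-category deviation of roughly that size.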
We first consider the 59 participants who were not identified as input impaired. According to the model, their repetition should be predictable from their naming. To a considerable extent, it was. But there was a clear difference in the model versions, with the mean rmsd for the semantic-phonological model (.052) significantly lower than that of the weight-decay model (.070), p < .03 by a paired t-test. Figures 2 and 3 provide a picture of the differences in the prediction accuracies by plotting the predicted and obtained values for the correct response category for the weight-decay and semantic-phonological models, respectively. The solid line shows where prediction is perfect and the dotted lines identify arbitrary boundaries of deviation greater than .20. In comparison to the semantic-phonological model, the weight-decay version is associated with cases in which the actual repetition score is much better than predicted. The fault lies in the weight-decay model’s inability to distinguish phonological from nonphonological lesions. Consider HN as an example (Table 1). HN’s naming is relatively poor (.35 correct) and includes more lexical errors than nonword errors. When the weight-decay model is applied to this naming pattern, the assumption that damage is global (either to weights or decay rate) forces it to postulate a severe global impairment, in this case an abnormally high decay (.885, instead of the normal .500). Saddled with a high decay parameter, the weight-decay model then predicts relatively poor repetition (.56 correct), a considerable underprediction of the obtained value (.79 correct). The semantic-phonological model, in contrast, uses parameters that distinguish between the lexical-semantic and the lexical-phonological weights. The many lexical errors that HN makes in naming cause the model to diagnose a greater impairment in the semantic parameter (.009) than in the phonological parameter (.020).
When these parameters are applied to repetition, which uses only the phonological retrieval step of the model and consequently is more affected by the phonological than the semantic weights, repetition is predicted to be much better than one would expect from naming. The predicted value of .77 is very close to the obtained value of .79.
Although we can conclude that the semantic-phonological model predicts repetition better than the weight-decay model, we would like to be able to judge its quality independently of the weight-decay model. Are its predictions sufficiently accurate that it helps our understanding of lexical processing and aphasia? For most of the patients, the predictions of the semantic-phonological model are reasonably accurate. The median rmsd for this model is .037. An example of a patient with this value is BBC (see Table 1) and the match here is intuitively close. At the same time there are a few patients whose repetition is considerably better than predicted, for example, XD, FAG, BT, FT, BAT, FM, and DAN, who are identifiable in Figure 3 as points above the upper dotted line. These are clear deviations from the model’s predictions and we will return to them when we consider the dual-route version of the model.
The mean rmsd of .052 for the semantic-phonological model is, by itself, hard to interpret. One might note, for example, that the mean rmsd of the same model’s fit to the naming data from Schwartz et al. (2006) is only .024, and hence conclude that the model does a better job of explaining naming than repetition. Such a conclusion is premature for a couple of reasons. First, two kinds of measurement error work against the accuracy of the repetition predictions: error in the naming data, which determines the parameter values that guide the predictions, and error in the repetition data itself. The naming fits are only subject to measurement error in naming. As a specific example of how error in the characterization of naming can lead to less accuracy in predicting repetition, consider the fact that the naming data used by Schwartz et al. to obtain model parameters treated naming omission errors according to an “independence model” – they are essentially treated as missing observations. Hence omission in naming does not affect the assigned parameters. It may be the case, though, that the occurrence of omission in naming is associated with repetition performance. If so, our methods will not pick this up. Second, and more important, the model’s fit to the naming data involves two free parameters. The s and p parameters are allowed to vary to make the model as close as possible to the data. The predictions for repetition, though, are absolute or zero-parameter predictions. None of the repetition data is examined in the process of making the prediction. We cannot stress enough the relative difficulty of such predictions in comparison to those in which there are free parameters. Given these considerations, we conclude that the quality of the model’s match for repetition at least approaches that of the match for naming.
One way to evaluate model predictions is to compare them to a baseline model. For example, the model-data deviations can be compared to the deviations of each data point from the obtained category means, as in computations of variance accounted for. This kind of baseline model, however, uses the repetition data itself to construct the model (the category means). Here, we sought to create a baseline model that better respects the zero-parameter character of the predictions; the naming data are used to predict repetition without consultation of the repetition data. The simplest possible baseline of this sort is the naming data itself. What if each patient’s repetition pattern is predicted to be identical to his/her naming pattern? This naming baseline model is much less accurate than either the semantic-phonological or weight-decay models, rmsd = .114.
The naming baseline, however, is not a particularly stringent test. Repetition is on average more accurate than naming and so the naming baseline will necessarily be off. If we could sneak a peek at the repetition data, we could construct a more worthy baseline, one that corrects for the average superiority of repetition, which is .22 for the 59 participants. Specifically, the augmented naming baseline model predicted the repetition response proportions to be equal to the naming response proportions, with .22 added to the correct proportion (resulting in a maximum of 1.0 if naming is better than .78), and all the error proportions adjusted so that the total of the correct and error proportions is 1.0 and the relative error proportions are exactly the same as found in naming. For example, the predicted pattern for NAC (#8 in Table 1) using the augmented naming baseline model is: correct = .96 (.74+.22), semantic = .01, formal =.01, mixed = .01, unrelated = .01, and nonword = 0. The augmented naming baseline model is fairly accurate, generating an rmsd of .067, comparable to that of the weight-decay model. The semantic-phonological model, though, is significantly better than the baseline, p < .04 by a paired t-test. So, even though the baseline model used all of the naming data and took advantage of a large peek at the repetition data to get the average increment for repetition right, it cannot match the semantic-phonological model, which uses theoretically motivated mechanisms to diagnose lexical-semantic and lexical-phonological deficits from the naming errors, and applies these differentially to the repetition task. In subsequent analyses, we therefore focus exclusively on the semantic-phonological version of the model.
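The augmented naming baseline can be sketched as follows (the function name is ours; the example naming pattern is reconstructed to reproduce the NAC illustration in the text, under our assumption of equal error proportions across the four lexical categories):

```python
def augmented_baseline(naming, boost=0.22):
    """Predict repetition from naming: add the group's average repetition
    advantage to the correct proportion (capped at 1.0), then rescale the
    error proportions, preserving their relative sizes from naming."""
    correct = min(naming["correct"] + boost, 1.0)
    old_errors = 1.0 - naming["correct"]
    scale = (1.0 - correct) / old_errors if old_errors else 0.0
    pred = {cat: p * scale for cat, p in naming.items() if cat != "correct"}
    pred["correct"] = correct
    return pred

# Reconstructed naming pattern for the NAC example (our assumption)
nac_naming = {"correct": 0.74, "semantic": 0.065, "formal": 0.065,
              "mixed": 0.065, "unrelated": 0.065, "nonword": 0.0}
nac_pred = augmented_baseline(nac_naming)
```

Note that the boost of .22 is derived from the repetition data, so this baseline is, if anything, advantaged relative to the zero-parameter model predictions it is compared against.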
One way that a model can be tested is, paradoxically, to see if it can predict when it fails and, even better, the nature of this failure. If the perfect recognition assumption is not met, the model should not accurately predict repetition because some of the errors have presumably occurred during input processing. For the six participants identified as input-processing impaired (Table 3), the semantic-phonological model’s repetition predictions were markedly inaccurate. The mean rmsd was .112, and each of the rmsd values was worse than the mean based on the 59 participants who were not impaired in input processing.
Moreover, the manner of the model failure with these participants should be consistent with the perfect recognition assumption being false. The model should overpredict correct performance and underpredict formal errors. Formal errors such as “cap” for “cat” are the expected errors in a word recognition task, particularly when all stimuli are words and the participants know this. If impaired input processing leads to the mishearing of a word as a similar word, repetition will include trials in which the misheard word is repeated instead of the correct one, thus substituting potential correct responses with formal errors. That is exactly what happened with the six input-impaired participants. All six of them made more formal errors in repetition than predicted (mean obtained = .200, mean predicted = .047), and the mean underprediction of .153 was significantly greater (p < .02) than the negligible underprediction (.011) present in the 59 participants who were not impaired in input processing. The extra formal errors made by the input-impaired group came at the expense of correct responses. Correct responses were overpredicted by the model for this group by .175, and hence their poorer than expected performance was almost entirely due to the extra formal errors. The overprediction of correct responses was significantly greater for the input-impaired group than for the other participants (p < .03), who averaged a small underprediction in this category (.057, largely due to the seven individuals mentioned before whose repetition was unexpectedly high). As a further confirmation of this tradeoff between formals and corrects, we note that nonword errors, the only other common repetition response, were neither over- nor underpredicted by the model for the input-impaired participants (mean obtained = .200, mean predicted = .188).
Thus, the way that the model fails to fit this group is expected from the model’s mechanisms, plus an imperfect word recognition step that delivers potential formal errors as well as correct recognitions to the phonological retrieval step.
Our final analysis of the model’s application to repetition focuses on the correlational structure of the relation between naming and repetition. The first point to make is that, on average, the better someone is at naming, the better they are in repetition. Correct naming and correct repetition correlate +.57 in the group of 59 patients. It turns out, though, that the component of naming that best predicts repetition correctness is not naming correctness, but one of the naming error categories, the nonword category. Table 5 presents the correlations between each naming response category and repetition performance. The −.74 correlation between nonwords in naming and repetition (negative because an error proportion in naming is predicting a correct repetition proportion) confirms the association between the phonological retrieval step of naming and repetition. Nonwords in naming are required by the model to be errors of phonological retrieval. The fact that they most strongly predict repetition correctness demonstrates the importance of that step in repetition. In this respect, it is noteworthy that the second strongest naming error predictor is the rate of formal errors. Formal errors, in naming, have a dual nature (Schwartz et al., 2006). Some occur during phonological retrieval and some during word retrieval. Because some of them occur during phonological retrieval, they would be expected to predict repetition better than semantic, unrelated, or mixed errors, which are word-retrieval errors.
Table 5 also shows how the obtained naming response proportions correlate with the semantic-phonological model’s prediction of correct repetition. There are two noteworthy features of these correlations. First, their overall pattern is strikingly similar to the correlations with the actual obtained repetition. The relative sizes of the correlations and their directions are the same for predicted and obtained repetition. Of course, the correlations between obtained naming and predicted repetition are larger because there is no noise in the prediction compared to the actual data. Predicted repetition is uniquely determined by the naming data. The second significant aspect of the model correlations is the sheer size of the correlation between obtained naming nonwords and predicted repetition accuracy (−.99). The variance in the predictions can be traced almost entirely to nonwords. This is because the nonword naming category is the purest estimate of the efficiency of the phonological retrieval step, which in turn is much more associated with the phonological weight parameter than the semantic weight parameter. Thus, the model, as well as the data, exhibits a strong association between errors of phonological retrieval in naming and repetition ability (provided that input processing is not impaired). More generally, the relation between naming and repetition performance exhibited by aphasic subjects is mimicked in the model.
The semantic-phonological model using a single lexical route did a fair job of accounting for word repetition. Can adding a non-lexical route to the semantic-phonological model improve its performance? It clearly did for the two patients studied by Hanley et al. (2004). The question, though, is whether patient repetition can generally be characterized by the use of two routes.
The 30 patients for whom we have nonword repetition data were used to set up dual-route simulations according to procedures described by Hanley et al. (2004). None of the 30 fell in the poor input processing group, and four of them, XD, FAG, BT, and FM, were among the seven patients whose repetition was much better than expected by the single-route version of the model. Thus, this sample should be sufficient to compare the single- and dual-route approaches.
The first modeling step was to specify the non-lexical route. Each patient’s nonword repetition data, categorized as proportions of correct responses, lexicalizations (word responses), and other responses, were used to set the strength of the non-lexical parameter of the dual-route model (nl). This entailed assigning the lexical-semantic and lexical-phonological weights to the values of s and p, respectively, as determined by the patient’s word naming. After that, a non-lexical node representing the nonword to be repeated was set up along with connections to and from its phoneme nodes. The strength of these connections is the nl parameter. To run the model in nonword repetition mode, the non-lexical node was given a jolt of 100 units of activation and this activation spread for the same number of time steps that occurs for word repetition (8). Following that, the most active phoneme nodes were determined, allowing for an assessment of whether the nonword was correctly repeated, and if not, what kind of error occurred. To determine the strength of the non-lexical route for each patient, nl was varied until it generated a nonword repetition error pattern that closely matched the patient’s proportion of correct, lexicalization, and other responses on the nonword stimuli. This turned out to be fairly easy to do by hand. The resulting fit to the nonword repetition data was good across the sample (mean rmsd = .033). The good fit is desirable, but not unexpected given that there is a free parameter (nl) and it must account for only three proportions constrained to add to one.
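The hand-fitting of nl amounts to a one-dimensional search for the value that minimizes the mismatch to three proportions. A grid-search sketch of that search (with a toy stand-in for the model simulation; the real procedure runs the dual-route network at each candidate value) is:

```python
import math

def fit_nl(simulate, target, grid):
    """Grid-search the non-lexical strength parameter (nl) for the value
    whose simulated nonword-repetition pattern -- proportions of correct,
    lexicalization, and other responses -- best matches the observed one."""
    def rmsd(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))
    return min(grid, key=lambda nl: rmsd(simulate(nl), target))

# Toy stand-in for a run of the dual-route model at a given nl value;
# the actual simulation spreads activation through the lexical network.
toy_simulate = lambda nl: [nl, (1 - nl) / 2, (1 - nl) / 2]
best_nl = fit_nl(toy_simulate, [0.60, 0.20, 0.20],
                 [i / 100 for i in range(101)])
```

With only one free parameter and three proportions constrained to sum to one, a close fit to the nonword data is unsurprising, as noted above.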
After the non-lexical route was specified for each patient, it was then used in tandem with the lexical route to predict word repetition. Specifically, instead of just jolting the target word node, e.g. CAT, the dual-route model also applies a jolt to a non-lexical representation of the target word with connections to and from its phonemes, e.g. /k/, /ae/ and /t/. Hence, both lexical and non-lexical sources of activation combine to generate the repetition response. Importantly, once the value of nl is set from the nonword repetition data, word repetition can be predicted without any additional free parameters. So, just like the single-route version of the model, the word-repetition error pattern is a zero-parameter prediction from the dual-route model.
Before we directly compare the single- and dual-route versions of the model, it is important to note that, logically, there is a third possibility: Words could be repeated solely through the hypothesized non-lexical route. Even though a lexical representation exists for the stimulus, it nonetheless could be repeated as if it were a nonword, just by transmitting its input representation directly to output phonology (the phoneme nodes in our model). Our study thus offers an opportunity to compare a pure non-lexical account of repetition to those that use a lexical route, either by itself or in conjunction with the non-lexical route. Specifically, a pure non-lexical account predicts that word repetition would be equal to nonword repetition, provided that the length and phonological legality of the stimuli are similar. (All of the nonwords in our test were derived from existing concrete words and were phonologically legal. They averaged 1.37 syllables in length. The words from the Philadelphia Repetition Test averaged 1.58 syllables. So, the words and nonwords were comparable.) Figure 4 plots correct word repetition as a function of correct nonword repetition for the 30 subjects. Although word and nonword repetition are clearly related, word repetition is invariably superior, often by a large margin. This finding is contrary to a pure non-lexical model of repetition and demonstrates that a lexical route is involved in some manner. The question is whether the non-lexical route contributes to the lexical route, as proposed by the dual-route model, or whether it does not.
Table 6 shows the word and nonword repetition data from the sample of 30, the fits of the dual-route semantic-phonological model to these data, and repeats the fits of the single-route version of the model from Table 1. Figures 5 and 6 show predicted and obtained correct word repetition from the single- and dual-route models, respectively. Two things are apparent from the figures. First, the single-route model does about as well as would be expected from the larger sample (Figure 3), with the four patients mentioned previously as the main deviates (see the four points near a predicted repetition of .60 and an obtained repetition of .90). Second, the dual-route model has a striking tendency to predict that repetition will be extremely good; its predictions are, with only a single exception, in excess of the obtained repetition. In fact, for 14 patients, the predicted proportion correct from the dual-route model was a perfect 1.0, whereas not a single patient in the sample repeated all words perfectly.
If we compare the fits quantitatively, we get a mixed picture. The mean rmsd was slightly lower (better) for the dual-route than for the single-route model (.046 to .055, respectively), although this difference was not significant, p > .40. At the same time, though, the fit as measured by rmsd was better for the single-route model in 19 patients, and better for the dual-route model in only 9 patients (with 2 ties). The discrepancy between assessment using the mean rmsds of the two models and the patient counts arises because of the four cases in which the single-route model under-predicts by a large margin. The dual-route model does an excellent job of explaining these four. For example, XD’s repetition was .90 correct, with the single-route model predicting .61 and the dual-route model predicting .96. For the 26 patients other than these four, the dual-route model over-predicts, sometimes to a very large extent, while the single-route model does a good job (rmsd = .039 for single-route; .050 for dual-route).
It appears that adding the non-lexical route does not generally improve the predictions of the model. It does, however, provide a potential account of why some patients’ repetition performance is unexpectedly good, such as the four (or seven from the larger sample). Thus, we conclude that, for the large majority of patients, the single-route version of the model provides a good account of word repetition. Some patients, however (13% of the sample of 30), may use both lexical and non-lexical routes, combining their activations as specified by the dual-route model.
The interactive two-step model was able to predict word repetition from word naming when it was augmented with two assumptions. These were, first, that naming and repetition share the phonological retrieval step of lexical access during production and, second, that the word recognition component of repetition is errorless. The model could account for repetition being, on average, more accurate than naming (because naming has an additional errorful word retrieval step) and for the differing distribution of error types in the two tasks (because the phonological retrieval step primarily generates nonword and formal errors). In accounting for individual variation in repetition, the semantic-phonological version of the model was more successful than the weight-decay version. In addition, the structure of the correlations between the obtained naming error categories and obtained repetition was reproduced in the structure of correlations between obtained naming and the semantic-phonological model’s predictions. For both the model and the data, individual repetition performance was overwhelmingly associated with nonword errors in naming.
The model’s predictions at the individual level were not always accurate. The weight-decay version of the model frequently underpredicted repetition accuracy because it could not separate the lexical-phonological and lexical-semantic contributions to naming errors. The semantic-phonological model was much better in this respect, but it still could not account for the unexpectedly good repetition of a small number of aphasic individuals. A dual-route version of that model can account for these few patients, but does not generally improve the account for all patients.
In another small group of patients who were impaired at processing input, the single-route model predicted that repetition would be more accurate than it was and that formal errors specifically would be less common than they were. These latter deviations, though, were completely expected because the perfect recognition assumption is clearly not true for these individuals. Impaired input processing leads to formal errors in addition to correct recognitions in the word recognition stage, thus increasing formal errors and decreasing correctness in final output.
These tests of the model support a number of conclusions. First, the distinction between word and phonological retrieval receives strong support. The model’s two-step nature is largely responsible for explaining the differences in naming and repetition errors, and the extent to which naming predicts repetition. Second, the damage assumptions of the semantic-phonological model are supported, insofar as they allow for differential damage to lexical-semantic and lexical-phonological mappings. Other evidence also supports this conclusion. Using the methods reported here, we repeated the repetition model tests performed by Foygel and Dell (2000) with the 11 patients from Dell et al. (1997) who had both naming and repetition data and who did not have evidence of impaired input processing. The mean rmsds for the weight-decay and semantic-phonological models were .049 and .034, respectively. Thus, we have evidence that the superiority of the latter model is replicable. Moreover, Ruml et al. (2005) tested the repetition predictions of an Italian version of the semantic-phonological model on 50 Italian patients. Again, the mean rmsd was .034. While studies of naming alone have supported claims of differential damage across linguistic levels (e.g. Rapp & Goldrick, 2000; Schwartz et al., 2006), the consideration of naming and repetition together is definitive. The kinds of errors in naming that suggest phonological damage are the ones that are particularly predictive of word repetition. This point is not new (e.g. Caplan et al., 1986; Kohn, 1989), but placing these predictions in the context of a precise model strengthens the case. Not only do phonological errors in naming signal that there will be impaired repetition, but the quantitative properties of the naming errors predict the quantitative properties of repetition.
Another important conclusion concerns the multi-level approach to repetition that has emerged from studies of list memory. The multi-level approach proposes that repeating a short list of items depends primarily on input and output phonological representations, but also that performance is influenced by other linguistic levels, notably the lexical and semantic levels. In short, this approach emphasizes the continuity between language processing and verbal short-term memory. The interactive two-step model implements such an approach for single words. Lexical influences on repetition occur because the model assumes that the first step in the process involves word recognition and hence the lexical level plays a critical role. Semantic influences are present because of the model’s bidirectional or interactive spreading of activation. The extent to which the model is able to account for repetition then offers additional support for the multi-level approach insofar as the model concretizes the theory and shows that it is at least consistent with aphasic single-word repetition.
We wish to emphasize, though, that unlike in the naming study (Schwartz et al., 2006), there is nothing specific in our repetition data or analyses that defends the model’s interactive flow of activation. The naming data from these participants, which were presented in Schwartz et al., showed that lexical errors in naming were affected by the bidirectional spread of activation between output phonology and the word level. Because there were few errors in the repetition data that were unambiguously lexical, however, the present analyses are silent on the question of interaction. The assumption that activation flows up to semantics and back down during repetition does, though, explain other findings, notably the greater accuracy in repetition that is associated with semantically “stronger” items such as concrete words or, in patients with semantic dementia, words whose semantics are not degraded (e.g. Jefferies et al., 2004; 2005; Hanley et al., 2002; N. Martin et al., 1996). As an example, consider the influence of the strength of the semantic weights on repetition in the semantic-phonological model when phonological weights are weak. If s = .05 and p = .005, repetition accuracy is .35. If the semantic weight is then reduced to zero, accuracy drops to .15. Because most errors in the model’s (and people’s) repetition are phonological in nature, the effect of the greater semantic weight is thus, paradoxically, to prevent phonological errors. With s = 0, the proportion of formal and nonword errors is .79; with s = .05, it is reduced to .62. This property of the model is reminiscent of the semantic binding hypothesis (e.g. Patterson et al., 1994), in which semantic representations of words are the “glue” that keeps their phonological representations intact. Clearly, semantic representations matter even though the model is only running the phonological access step for repetition.
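The way semantic feedback can protect phonological retrieval can be illustrated with a toy spreading-activation fragment. This is not the actual model: the three-unit chain, the weights, the decay rate, and the synchronous update rule below are all hypothetical, chosen only to show the qualitative point that feedback through semantics keeps the target word unit, and hence its phonemes, more active:

```python
def settle(s_weight, p_weight, steps=5, decay=0.5):
    """Synchronously update a toy three-unit chain (semantics <-> word
    <-> phoneme) with bidirectional connections and multiplicative
    decay. Returns the final activation of the target phoneme unit."""
    word, sem, phon = 1.0, 0.0, 0.0  # jolt the target word unit
    for _ in range(steps):
        new_sem = decay * sem + s_weight * word
        new_phon = decay * phon + p_weight * word
        new_word = decay * word + s_weight * sem + p_weight * phon
        sem, phon, word = new_sem, new_phon, new_word
    return phon

weak_semantics = settle(s_weight=0.0, p_weight=0.05)
strong_semantics = settle(s_weight=0.5, p_weight=0.05)
# Semantic feedback sustains the word unit across time steps, so the
# target phoneme ends up with more support:
assert strong_semantics > weak_semantics
```

In this toy network, as in the semantic-phonological model’s behavior described above, stronger semantic weights leave the target phonology with more activation at selection time, which is the sense in which semantic strength can, paradoxically, reduce phonological errors.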
Semantics also affects the nature of the errors committed (as in patients who occasionally make semantic errors in repetition; e.g. N. Martin & Saffran, 1992). When the semantic weight is .05 and the phonological weight .005, semantic and mixed errors, even though they are rare (.006 and .016 respectively), are actively promoted. Everything else being equal, a semantic error is 1.33 times more likely than a comparable unrelated error, and a mixed error is 1.63 times more likely than a comparable formal error. When the semantic weight is zero, semantic and mixed errors are no more likely than the comparable nonsemantic errors because there is no contribution from shared semantic features with the target. In summary, strong semantic weights lead to more accurate repetition, but can also cause semantic errors if phonological weights are low. Thus the model, through its interactive activation assumptions, is consistent with higher-level influences on repetition.
The model cannot, however, explain all of the facts about semantic errors in word repetition. In particular, the model can never generate more than 2% semantic errors in this task. A rare syndrome, deep dysphasia (there were no such cases in our sample), can be associated with numerous semantic repetition errors (Howard & Franklin, 1988). As we mentioned before, we had previously developed an account of deep dysphasic patient NC using the framework of the interactive two-step model (N. Martin et al., 1994). The deep dysphasia model had two features that enabled it to account for the noticeable presence of semantic errors in repetition, features that are not true of the current model. First, the deep dysphasia model allowed for errors to occur in the first, or word recognition, step of repetition, as well as in the second, or phonological access, step. The current model’s perfect recognition assumption does not allow for error in the recognition step. According to Martin et al., semantic errors in repetition are largely errors in this first step (see also N. Martin, Saffran, & Dell, 1996). Second, the earlier model assumed that the deep dysphasic lesion was an increase in activation decay, rather than a reduction in connection weights. A decay lesion especially promotes semantic errors during the recognition step because the initially activated units (the phoneme and word layers) have decayed relatively more than the later activated semantic units. In essence, the model remembers the meaning of the word it must repeat better than its form, thus making semantic errors possible. If this account of deep dysphasia is correct, one would not expect the present model to be able to simulate the word repetition of such patients. In the first place, a deep dysphasic would exhibit an input-processing deficit and thus would run afoul of the perfect recognition assumption.
NC, like the six input-impaired patients in our study, did poorly on auditory lexical decision, delayed phoneme discrimination, and word-picture matching (N. Martin et al., 1994). More seriously, though, allowing for decay impairment adds another lesionable dimension to the theory—a clear, but possibly necessary, complication.
The current model’s approach to repetition also contrasts with other aspects of prior treatments of repetition within our framework, specifically the shared phonological input/output assumption of N. Martin et al. (1994) and the dual-route approach of Hanley et al. (2004). In the former approach, the word recognition step of repetition uses the same phonological units and connections that are used in the phonological retrieval step of production. Thus errors in output should be associated with errors in input. Although this approach was successful in accounting for the repetition of NC (N. Martin et al., 1994; 1996), its repetition predictions were considerably less accurate than those of the current approach using the perfect recognition assumption in a comparison study by Dell et al. (1997) with 11 aphasic subjects. There are clear examples of patients with severe phonological output deficits whose input processing is not particularly impaired. For example, MQ (see Table 1) made more nonword errors than the typical patient in both naming (.28) and repetition (.24), and her fitted phonological weight was low, .011. Yet MQ presents no evidence of an input processing impairment. Her auditory discrimination and lexical decision performance was quite good (98% and 78% correct for discrimination after unfilled and filled intervals, respectively; 95% and 90% accuracy in lexical decision for words and nonwords, respectively). Moreover, her low rate of acceptance of phonological foils on the Philadelphia Naming Verification Test (.01), which is a direct measure of the probability of mishearing a word, was within the normal range. It is cases like MQ that suggest that the shared phonological input/output approach cannot apply generally.
We acknowledge, though, that the alternative that we advocate—separate input and output phonological units and connections—leaves unexplained the demonstrable correlations between measures of input and output phonological processing; on average, patients with good input processing have better output (e.g. N. Martin & Saffran, 1997). We can see this association in our data as well. The input-impaired group of six averaged more nonword errors in naming (.19) than the remaining 59 patients did (.13). In the face of such evidence, some have coupled the assumption of a common phonological level with the idea that top-down comprehension processes are available to compensate for phonological deficits on the input side (N. Martin & Saffran, 2000).
What about the other approach to repetition, the dual-route approach? The dual-route approach as implemented by Hanley et al. (2004) accepts the separation of input and output phonology and makes the perfect recognition assumption in its lexical route. The non-lexical route from input to output phonology, however, contributes activation as well. As a result, word repetition benefits from both lexical and non-lexical activation sources (Hillis & Caramazza, 1991). Our implementation of a dual-route version of the semantic-phonological model, tested on a sub-sample of 30 patients, did not generally support it over the single-route version of the model. The dual-route model was, however, quite successful at explaining the four patients in the sub-sample whose repetition was unexpectedly good, leading us to conclude that this model may be appropriate for some individuals. Although this conclusion is consistent with the modeling, it is somewhat unsatisfying. Why are these few patients, and not the others, using both routes? Can we develop an independent assessment that predicts when both routes are used? For example, the four patients that were clearly fit well by the dual-route model all have relatively low semantic weights (XD = .002; FAG = .020; BT = .005; FM = .021, where the mean and median of the sample are .030 and .026, respectively). Perhaps poor lexical-semantic processing induces the use of the non-lexical route to compensate. So, although the non-lexical route is hypothesized to be used in repetition only when the lexical route fails to retrieve a word (i.e. when the stimulus is a nonword), perhaps it comes to be used habitually, for words as well as nonwords, when there are frequent failures in tasks involving word-level representations (e.g. speaking, listening, reading, or writing of meaningful material).
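The benefit that a contributing non-lexical route confers can be sketched abstractly. In the fragment below, the Luce-style choice rule, the temperature parameter, and all activation values are hypothetical stand-ins for whatever selection mechanism an implemented model uses; the point is only that adding non-lexical activation to the target phonology raises the probability of a correct repetition:

```python
import math

def choice_prob(target_act, competitor_acts, temperature=0.1):
    """Probability that the target wins a Luce/softmax choice over its
    competitors, given their activation levels."""
    exps = [math.exp(a / temperature) for a in [target_act] + competitor_acts]
    return exps[0] / sum(exps)

# Lexical route alone gives the target phonology a modest advantage:
lexical_only = choice_prob(0.30, [0.20, 0.15])
# The non-lexical route contributes activation directly to the target,
# since the heard form and the to-be-produced form match for words:
dual_route = choice_prob(0.30 + 0.25, [0.20, 0.15])
assert dual_route > lexical_only
```

Under any rule of this general shape, summing the two activation sources pushes predicted word repetition toward ceiling, which is consistent with the dual-route model’s tendency, noted above, to over-predict repetition accuracy for most patients.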
It is clear from the literature reviewed in this article that semantics contributes a great deal to the access and retention of words in all of these modalities. If the semantic contribution is absent or degraded, as arguably happens when there are low semantic weights, aphasic individuals may use other information, such as the non-lexical route, if it is functional and relevant. This speculation is supported by the fact that the two patients from Hanley et al. (2004) who were well fit by the dual-route model had low semantic weights (PS = .020; MF = .015). Moreover, so did the three patients in our study whose repetition was unexpectedly good but who were not evaluated by the dual-route model (FT = .015; BAT = .005; DAN = .011). These three would very likely be much better fit by the dual-route model if we had nonword repetition data on which to base the model. In any event, there is a need for more fine-grained assessment of cases in which repetition is unexpectedly good. At this point, we feel that the dual-route model is as good an explanation as any for these discrepancies.
It is important to note an alternative to an architecturally stipulated dual route approach to repetition. Perhaps the lexical and non-lexical routes could emerge from a single route from input to output phonology via an intermediate level of hidden units whose distributed representations of the mapping are set by a learning process (e.g. Seidenberg & McClelland, 1989). Similarly, one can also ask whether the explicit two steps hypothesized for lexical access from meaning can emerge in a model that uses learning algorithms to set connection weights. In fact, Plaut and Kello (1999) have implemented such a model, and Lambon Ralph and colleagues have invoked its properties (without simulation) as a framework for explaining naming and repetition impairments in semantic dementia and aphasia (e.g., Jefferies et al., 2005; 2006; Lambon Ralph, Cipolotti & Patterson, 1999; Lambon Ralph, Sage & Roberts, 2000). These learning-based models emphasize the systematicity or lack thereof in the mappings between representations. For example, going from semantics to phonology in the naming task is not systematic because words with similar meanings do not tend to have similar pronunciations (e.g. Dell et al., 1997; Plaut & Kello, 1999). Word repetition and oral reading tasks, in contrast, are systematic, in that the input structure is predictive of the output. When a mapping is not systematic, like naming, a layer of non-linear hidden units is required to mediate it, and the mapping is hard to learn and vulnerable to damage (Lambon Ralph, Sage, & Roberts, 2000; Lambon Ralph, Moriarty, & Sage, 2002). Although our model of the relation between naming and repetition is not based on learning, it does reflect insights regarding systematicity that these approaches emphasize. Naming in our view entails the active selection of mediating lexical units during the first step. These are like hidden units, and the active selection introduces the necessary nonlinearity. 
Repetition, in contrast, does not depend heavily on access from meaning and hence does not suffer from the error-prone first step that makes naming nonsystematic and difficult. At the same time, though, our approach differs from the learning models: in those models, the mediating representations emerge from other, more primary representations, whereas we stipulate them to be lexical in nature and associated with grammatical information (e.g. Schwartz et al., 2006).
The discussion of the cases in which repetition was much better than predicted raises the general question of how one draws theoretical conclusions when a model does not account for all of the data. We have discussed this in the companion paper (Schwartz et al., 2006) and the issue has been raised in the context of earlier versions of the model (Dell et al., 2000; Rapp & Goldrick, 2000; Ruml & Caramazza, 2000; Ruml et al., 2005). Our view can be summarized by a quote from the statistician Box (1979): “All models are wrong, but some are useful.” We do not claim that the semantic-phonological version of the interactive two-step model augmented with the perfect recognition and single-route assumptions is right. It demonstrably is not for the input-impaired patients and for those patients whose repetition is much better than expected. We do contend that it is useful, though. Models serve as mediators between data and theory. In this case, the model makes explicit the mechanisms behind the two-step theory of lexical access in production, multi-level approaches to repetition and short-term memory, and lexical-route accounts of the relation between naming and repetition. It allows us to see that the data are supportive of these theoretical ideas by showing how the general differences between naming and repetition might arise from these ideas, and that much of the patient variation in these skills can be attributed to variation in model components. Where the model fails to predict the repetition patterns of the patients, the model itself sometimes makes it easy to understand the failures, as we saw with the input-impaired group. The mismatch between data and model predictions can be directly attributed to a model assumption, in this case the perfect recognition assumption. Where the model’s failure is unexpected, as in the seven cases just described, it stimulates a search for alternative mechanisms, such as the hypothesized non-lexical contribution to repetition.
Ultimately, though, we hope to say in the future that the model was useful because it led to a better one.
To conclude, we revisit a point made in Schwartz et al. (2006) regarding the relation between these kinds of computational models and traditional neuropsychological “box-and-arrow” models (see also, Coltheart, 2004; Dell, 2004). These should not be seen as competing frameworks. The computational method should instead be considered an extension of the traditional approach. The model’s parameters quantify the level of damage to components, thus allowing for more precise testing of hypothesized components and their contributions to neuropsychological tasks. This is particularly true of the application of the model to repetition. Naming is hypothesized to involve two sets of connections, one that is wholly shared with repetition (lexical-phonological connections) and one that contributes only indirectly to repetition (lexical-semantic connections). Quantifying the level of damage to each in an individual then leads to a quantitative prediction about repetition. The model’s computational mechanisms are required because the result of the combination of the damaged components is not always obvious. The underlying logic, though, is the same as that in the traditional models. Whether one is talking about boxes and arrows, or units and connections, the goal is to discover the parts of the system and their influences upon one another by relating hypothesized impairments to system parts, and relating parts to performance on multiple tasks. This is what we have tried to do here.
This project is supported by a grant from the NIH: R01 DC00191 (M.F. Schwartz). The authors are grateful to all who participated in the study and to the speech-language pathologists of the Center for Communication Disorders of the Moss Rehabilitation Research Institute and other Philadelphia-area facilities who referred these individuals to us. We acknowledge with thanks the important contributions of Paula Sobel and Adelyn Brecher, which included patient testing, scoring, and data management, and of Judy Allen, for work on the manuscript. We are also grateful for valuable comments from Matt Lambon Ralph, Merrill Garrett, and an anonymous reviewer.
Gary S. Dell, University of Illinois, Urbana-Champaign.
Nadine Martin, Temple University.
Myrna F. Schwartz, Moss Rehabilitation Research Institute, Albert Einstein Healthcare Network.