Results 1-25 (12402)

1.  Exchange of Stuttering From Function Words to Content Words With Age 
Dysfluencies on function words in the speech of people who stutter mainly occur when function words precede, rather than follow, content words (Au-Yeung, Howell, & Pilgrim, 1998). It is hypothesized that such function word dysfluencies occur when the plan for the subsequent content word is not ready for execution. Repetition and hesitation on the function words buys time to complete the plan for the content word. Stuttering arises when speakers abandon the use of this delaying strategy and carry on, attempting production of the subsequent, partly prepared content word. To test these hypotheses, the relationship between dysfluency on function and content words was investigated in the spontaneous speech of 51 people who stutter and 68 people who do not stutter. These participants were subdivided into the following age groups: 2–6-year-olds, 7–9-year-olds, 10–12-year-olds, teenagers (13–18 years), and adults (20–40 years). Very few dysfluencies occurred for either fluency group on function words that occupied a position after a content word. For both fluency groups, dysfluency within each phonological word occurred predominantly on either the function word preceding the content word or on the content word itself, but not both. Fluent speakers had a higher percentage of dysfluency on initial function words than content words. Whether dysfluency occurred on initial function words or content words changed over age groups for speakers who stutter. For the 2–6-year-old speakers that stutter, there was a higher percentage of dysfluencies on initial function words than content words. In subsequent age groups, dysfluency decreased on function words and increased on content words. These data are interpreted as suggesting that fluent speakers use repetition of function words to delay production of the subsequent content words, whereas people who stutter carry on and attempt a content word on the basis of an incomplete plan.
PMCID: PMC2013932  PMID: 10229451
stuttering; phonological words; function words; content words; speech plan
2.  Detecting Changes in Chief Complaint Word Count: Effects on Syndromic Surveillance 
To identify changes in emergency department (ED) syndromic surveillance data by analyzing trends in chief complaint (CC) word count; to compare these changes to coding changes reported by EDs; and to examine how these changes might affect the ability of syndromic surveillance systems to identify syndromes in a consistent manner.
The New York City (NYC) Department of Health and Mental Hygiene (DOHMH) receives daily ED data from 49 of NYC’s 52 hospitals, representing approximately 95% of ED visits citywide. Chief complaint (CC) is categorized into syndrome groupings using text recognition of symptom key-words and phrases. Hospitals are not required to notify the DOHMH of any changes to procedures or health information systems (HIS). Previous work noticed that CC word count varied over time within and among EDs. The variations seen in CC word count may affect the quality and type of data received by the DOHMH, thereby affecting the ability to detect syndrome visits consistently.
The daily mean number of words in the chief complaint field was examined by hospital from 2008–2011. Spectral analyses were performed on daily CC word count by hospital to explore temporal trends. Change Point Analysis (CPA) using Taylor’s Method with a maximum change level of four was conducted on the CC field by hospital using 1,000 bootstrap samples. According to Taylor, a level 1 change is the most important change detected on the program’s first pass through the data. For this analysis, a change point was considered significant if it was level 1, detected an average change of more than 0.50 words per day, and was sustained for at least 6 months before a level 2 change of at least 0.50 words occurred. Results of the CPA were compared to reported changes identified by a survey of 49 hospitals conducted by DOHMH staff, which collected information about their HIS and coding practices, including any recent system changes.
When a significant level one change was identified, time series graphs for six months before and after the change were created for five syndromes (cold, diarrhea, fever-flu, influenza-like-illness, and respiratory) and the syndrome’s constituent symptom categories (e.g. cough fever, etc.). Changes in syndrome count and composition at the level one change in word count were noted.
The mean chief complaint word count across all NYC hospitals from 2008–2011 was 3.14, with a range of 0 to 18 words. CPA detected a significant level 1 change in 21 hospitals, with a mean change of 0.60 words: 9 increases (mean = 0.71 words) and 12 decreases (mean = 0.53 words). According to the results of a survey of 49 NYC EDs, 19 have changed coding practices or health information systems since 2008. CPA identified a coincident and significant shift in word count for 8 of these hospitals. CPA also detected significant shifts in word count for 13 hospitals that did not report any changes. Figure 1 shows the results of CPA from one ED in NYC.
We observed immediate changes in daily syndrome count after the detected change in CC word count. For example, respiratory syndrome count increased with increased word count and decreased with decreased word count for 10 of the 21 EDs with a significant change in word count. Only 2 EDs saw an opposite effect on respiratory syndrome count. Meanwhile, 9 EDs saw no obvious change in respiratory syndrome count. Furthermore, these changes in daily CC word count coincided with subsequent changes in syndrome composition, the breakdown of syndromes into constituent symptoms.
Change Point Analysis may be a useful method for prospectively detecting shifts in CC word count, which might represent a change in ED practices. In some instances, changes to CC word count had an apparent effect on both syndrome capture and syndrome composition. Further studies are required to determine how often these changes happen and how they may affect the quality of syndromic surveillance.
PMCID: PMC3692864
Chief Complaint; Word Count; Change Point Analysis
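Below is a minimal Python sketch of the kind of change-point screen described in entry 2: a Taylor-style CUSUM statistic with bootstrap resampling applied to a daily chief-complaint word-count series. The series, thresholds, and helper names are illustrative assumptions, not the DOHMH data or the authors' exact implementation.

```python
# Illustrative change-point screen for a daily chief-complaint word-count series.
# Follows the general CUSUM-plus-bootstrap idea attributed to Taylor; not the
# authors' exact implementation. The input series here is synthetic.
import random

def cusum_range(series):
    """Return the range of the cumulative sums of deviations from the mean."""
    mean = sum(series) / len(series)
    s, lo, hi = 0.0, 0.0, 0.0
    for x in series:
        s += x - mean
        lo, hi = min(lo, s), max(hi, s)
    return hi - lo

def detect_change(series, n_boot=1000, conf=0.95):
    """Estimate a single change point and its bootstrap confidence level."""
    observed = cusum_range(series)
    exceed = 0
    for _ in range(n_boot):
        shuffled = random.sample(series, len(series))  # permuted copy
        if cusum_range(shuffled) < observed:
            exceed += 1
    confidence = exceed / n_boot
    if confidence < conf:
        return None
    # Change point estimated where |CUSUM| is largest.
    mean = sum(series) / len(series)
    s, cusums = 0.0, []
    for x in series:
        s += x - mean
        cusums.append(abs(s))
    idx = cusums.index(max(cusums))
    before = sum(series[:idx + 1]) / (idx + 1)
    after = sum(series[idx + 1:]) / (len(series) - idx - 1)
    return idx, before, after, confidence

# Synthetic example: mean word count shifts from ~3.1 to ~3.8 at day 180.
random.seed(1)
daily = [random.gauss(3.1, 0.4) for _ in range(180)] + \
        [random.gauss(3.8, 0.4) for _ in range(180)]
result = detect_change(daily)
if result:
    idx, before, after, confidence = result
    print(f"change near day {idx}: {before:.2f} -> {after:.2f} words "
          f"(bootstrap confidence {confidence:.2f})")
```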
3.  The Mechanism of Word Crowding 
Vision research  2011;52(1):61-69.
Word reading speed in peripheral vision is slower when words are in close proximity to other words (Chung, 2004). This word crowding effect could arise as a consequence of the interaction of low-level letter features between words, or of the interaction between high-level holistic representations of words. We evaluated these two hypotheses by examining how word crowding changes for five configurations of flanking words: the control condition, in which flanking words were oriented upright; scrambled, in which letters in each flanking word were scrambled in order; horizontal-flip, in which each flanking word was the left-right mirror-image of the original; letter-flip, in which each letter of the flanking word was the left-right mirror-image of the original; and vertical-flip, in which each flanking word was the up-down mirror-image of the original. The low-level letter feature interaction hypothesis predicts a similar word crowding effect for all the different flanker configurations, while the high-level holistic representation hypothesis predicts a weaker word crowding effect for all the alternative flanker conditions compared with the control condition. We found that oral reading speed for words flanked above and below by other words, measured at 10° eccentricity in the nasal field, showed the same dependence on the vertical separation between the target and its flanking words for the various flanker configurations. The result was also similar when we rotated the flanking words by 90° to disrupt the periodic vertical pattern, which presumably is the main structure in words. The remarkably similar word crowding effect irrespective of the flanker configurations suggests that word crowding arises as a consequence of interactions of low-level letter features.
PMCID: PMC3246086  PMID: 22079315
crowding; word recognition; peripheral vision; features; holistic representation
4.  To mind the mind: An event-related potential study of word class and semantic ambiguity 
Brain research  2006;1081(1):191-202.
The goal of this study was to jointly examine the effects of word class, word class ambiguity, and semantic ambiguity on the brain response to words in syntactically specified contexts. Four types of words were used: (1) word class ambiguous words with a high degree of semantic ambiguity (e.g., ‘duck’); (2) word class ambiguous words with little or no semantic ambiguity (e.g., ‘vote’); (3) word class unambiguous nouns (e.g., ‘sofa’); and (4) word class unambiguous verbs (e.g., ‘eat’). These words were embedded in minimal phrases that explicitly specified their word class: “the” for nouns (and ambiguous words used as nouns) and “to” for verbs (and ambiguous words used as verbs). Our results replicate the basic word class effects found in prior work (Federmeier, K.D., Segal, J.B., Lombrozo, T., Kutas, M., 2000. Brain responses to nouns, verbs and class ambiguous words in context. Brain, 123 (12), 2552–2566), including an enhanced N400 (250–450ms) to nouns compared with verbs and an enhanced frontal positivity (300–700 ms) to unambiguous verbs in relation to unambiguous nouns. A sustained frontal negativity (250–900 ms) that was previously linked to word class ambiguity also appeared in this study but was specific to word class ambiguous items that also had a high level of semantic ambiguity; word class ambiguous items without semantic ambiguity, in contrast, were more positive than class unambiguous words in the early part of this time window (250–500 ms). Thus, this frontal negative effect seems to be driven by the need to resolve the semantic ambiguity that is sometimes associated with different grammatical uses of a word class ambiguous homograph rather than by the class ambiguity per se.
PMCID: PMC2728580  PMID: 16516169
Language; Word class; Word class ambiguity; Noun–verb homonymy; ERP
5.  The Evolution of Word Composition in Metazoan Promoter Sequence 
PLoS Computational Biology  2006;2(11):e150.
The field of molecular evolution provides many examples of the principle that molecular differences between species contain information about evolutionary history. One surprising case can be found in the frequency of short words in DNA: more closely related species have more similar word compositions. Interest in this has often focused on its utility in deducing phylogenetic relationships. However, it is also of interest because of the opportunity it provides for studying the evolution of genome function. Word-frequency differences between species change too slowly to be purely the result of random mutational drift. Rather, their slow pattern of change reflects the direct or indirect action of purifying selection and the presence of functional constraints. Many such constraints are likely to exist, and an important challenge is to distinguish them. Here we develop a method to do so by isolating the effects acting at different word sizes. We apply our method to 2-, 4-, and 8-base-pair (bp) words across several classes of noncoding sequence. Our major result is that similarities in 8-bp word frequencies scale with evolutionary time for regions immediately upstream of genes. This association is present although weaker in intronic sequence, but cannot be detected in intergenic sequence using our method. In contrast, 2-bp and 4-bp word frequencies scale with time in all classes of noncoding sequence. These results suggest that different genomic processes are involved at different word sizes. The pattern in 2-bp and 4-bp words may be due to evolutionary changes in processes such as DNA replication and repair, as has been suggested before. The pattern in 8-bp words may reflect evolutionary changes in gene-regulatory machinery, such as changes in the frequencies of transcription-factor binding sites, or in the affinity of transcription factors for particular sequences.
One of the foundations of molecular evolution is the idea that more closely related species are more similar on the molecular level. One example that has been known for several years is the genomic composition of short words (i.e., short segments) of DNA. Given a sample of genome sequence, one can count the occurrences of all words of a certain length. It turns out that closely related species have more similar word frequencies. The pattern of how these frequencies change over evolutionary time is likely to be influenced by the many functions of the genome (coding for proteins, controlling gene expression, etc.). Bush and Lahn investigated the influence of genomic function on word-frequency variation in 13 animal genomes. Using a method designed to isolate the effects acting at particular word sizes, the authors examined how word frequencies vary in different categories of noncoding sequence. They found that interspecies patterns of word-frequency variation change depending on word size and sequence category. These results suggest that noncoding sequence is subject to different functional constraints depending on its location in the genome. An especially interesting possibility is that the patterns in longer words may reflect evolutionary changes in gene regulatory machinery.
PMCID: PMC1630712  PMID: 17083273
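As a toy companion to entry 5, the sketch below counts overlapping k-bp words in two made-up upstream sequences and compares their frequency profiles with a simple Euclidean distance; the distance measure and sequences are placeholders, not the statistic used in the paper.

```python
# Toy k-mer composition comparison between two DNA sequences. The word counts
# mirror the kind of 2-, 4-, and 8-bp frequency profiles discussed above; the
# Euclidean distance used here is only a stand-in for the authors' statistic.
from collections import Counter
from math import sqrt

def word_frequencies(seq, k):
    """Relative frequencies of all overlapping k-bp words in seq."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {w: counts[w] / total for w in counts}

def composition_distance(seq_a, seq_b, k):
    """Euclidean distance between the k-word frequency vectors of two sequences."""
    fa, fb = word_frequencies(seq_a, k), word_frequencies(seq_b, k)
    words = set(fa) | set(fb)
    return sqrt(sum((fa.get(w, 0.0) - fb.get(w, 0.0)) ** 2 for w in words))

# Example with short made-up "upstream" sequences from two species.
species_1 = "TATAAAGGCCGCGCATTAGCTGATATAAAGGCCGC"
species_2 = "TATAAAGGCGGCGCATTAGCAGATATTAAGGCCGC"
for k in (2, 4, 8):
    print(k, round(composition_distance(species_1, species_2, k), 4))
```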
6.  Stuttering on function and content words across age groups of German speakers who stutter 
Recent research into stuttering in English has shown that function word disfluency decreases with age whereas content word disfluency increases. Also, function words that precede a content word are significantly more likely to be stuttered than those that follow content words (Au-Yeung, Howell and Pilgrim, 1998; Howell, Au-Yeung and Sackin, 1999). These studies have used the concept of the phonological word (PW) as a means of investigating these phenomena. Phonological words help to determine the position of function words relative to content words and to establish the origin of the patterns of disfluency with respect to these two word classes. The current investigation analysed German speech for similar patterns. German contains many long compound nouns; on this basis, German content words are more complex than English ones. Thus, the patterns of disfluency within phonological words may differ between German and English. Results indicated three main findings. First, function words that occupy an early position in a PW have higher rates of disfluency than those that occur later in a PW, this being most apparent for the youngest speakers. Second, function words that precede the content word in a PW have higher rates of disfluency than those that follow the content word. Third, young speakers exhibit high rates of disfluency on function words, but this drops off with age and, correspondingly, the disfluency rate on content words increases. The patterns within phonological words may be general to German and English and can be accounted for by the EXPLAN model, assuming that lexical class operates equivalently across these languages or that lexical categories contain some common characteristic that is associated with fluency across the languages.
PMCID: PMC2239212  PMID: 18270544
Stuttering; German; function and content words
7.  Recognizing Spoken Words: The Neighborhood Activation Model 
Ear and hearing  1998;19(1):1-36.
A fundamental problem in the study of human spoken word recognition concerns the structural relations among the sound patterns of words in memory and the effects these relations have on spoken word recognition. In the present investigation, computational and experimental methods were employed to address a number of fundamental issues related to the representation and structural organization of spoken words in the mental lexicon and to lay the groundwork for a model of spoken word recognition.
Using a computerized lexicon consisting of transcriptions of 20,000 words, similarity neighborhoods for each of the transcriptions were computed. Among the variables of interest in the computation of the similarity neighborhoods were: 1) the number of words occurring in a neighborhood, 2) the degree of phonetic similarity among the words, and 3) the frequencies of occurrence of the words in the language. The effects of these variables on auditory word recognition were examined in a series of behavioral experiments employing three experimental paradigms: perceptual identification of words in noise, auditory lexical decision, and auditory word naming.
The results of each of these experiments demonstrated that the number and nature of words in a similarity neighborhood affect the speed and accuracy of word recognition. A neighborhood probability rule was developed that adequately predicted identification performance. This rule, based on Luce's (1959) choice rule, combines stimulus word intelligibility, neighborhood confusability, and frequency into a single expression. Based on this rule, a model of auditory word recognition, the neighborhood activation model, was proposed. This model describes the effects of similarity neighborhood structure on the process of discriminating among the acoustic-phonetic representations of words in memory. The results of these experiments have important implications for current conceptions of auditory word recognition in normal and hearing impaired populations of children and adults.
PMCID: PMC3467695  PMID: 9504270
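The neighborhood probability rule described in entry 7 can be illustrated with a Luce-style ratio that combines stimulus intelligibility, neighbor confusability, and word frequency. The functional form and the numbers below are a schematic assumption rather than the published rule or its parameter values.

```python
# Sketch of a Luce-style choice rule combining stimulus intelligibility,
# neighborhood confusability, and word frequency, in the spirit of the
# neighborhood activation model. Illustrative only; see the paper for the
# exact rule.
def neighborhood_probability(p_stimulus, freq_stimulus, neighbors):
    """
    p_stimulus    : probability of the stimulus word given the input (0-1)
    freq_stimulus : frequency of occurrence of the stimulus word
    neighbors     : list of (confusion_probability, frequency) per neighbor
    """
    target = p_stimulus * freq_stimulus
    competition = sum(p * f for p, f in neighbors)
    return target / (target + competition)

# A high-frequency word in a sparse neighborhood vs. a low-frequency word in a
# dense neighborhood (made-up values).
easy = neighborhood_probability(0.7, 300, [(0.1, 20), (0.05, 10)])
hard = neighborhood_probability(0.7, 15, [(0.2, 250), (0.15, 180), (0.1, 90)])
print(f"easy word: {easy:.2f}, hard word: {hard:.2f}")
```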
8.  Lexical Effects on Spoken Word Recognition by Pediatric Cochlear Implant Users 
Ear and hearing  1995;16(5):470-481.
The purposes of this study were 1) to examine the effect of lexical characteristics on the spoken word recognition performance of children who use a multichannel cochlear implant (CI), and 2) to compare their performance on lexically controlled word lists with their performance on a traditional test of word recognition, the PB-K.
In two different experiments, 14 to 19 pediatric CI users who demonstrated at least some open-set speech recognition served as subjects. Based on computational analyses, word lists were constructed to allow systematic examination of the effects of word frequency, lexical density (i.e., the number of phonemically similar words, or neighbors), and word length. The subjects’ performance on these new tests and the PB-K also was compared.
The percentage of words correctly identified was significantly higher for lexically “easy” words (high frequency words with few neighbors) than for “hard” words (low frequency words with many neighbors), but there was no lexical effect on phoneme recognition scores. Word recognition performance was consistently higher on the lexically controlled lists than on the PB-K. In addition, word recognition was better for multisyllabic than for monosyllabic stimuli.
These results demonstrate that pediatric cochlear implant users are sensitive to the acoustic-phonetic similarities among words, that they organize words into similarity neighborhoods in long-term memory, and that they use this structural information in recognizing isolated words. The results further suggest that the PB-K underestimates these subjects’ spoken word recognition.
PMCID: PMC3495322  PMID: 8654902
9.  The word landscape of the non-coding segments of the Arabidopsis thaliana genome 
BMC Genomics  2009;10:463.
Genome sequences can be conceptualized as arrangements of motifs or words. The frequencies and positional distributions of these words within particular non-coding genomic segments provide important insights into how the words function in processes such as mRNA stability and regulation of gene expression.
Using an enumerative word discovery approach, we investigated the frequencies and positional distributions of all 65,536 different 8-letter words in the genome of Arabidopsis thaliana. Focusing on promoter regions, introns, and 3' and 5' untranslated regions (3'UTRs and 5'UTRs), we compared word frequencies in these segments to genome-wide frequencies. The statistically interesting words in each segment were clustered with similar words to generate motif logos. We investigated whether words were clustered at particular locations or were distributed randomly within each genomic segment, and we classified the words using gene expression information from public repositories. Finally, we investigated whether particular sets of words appeared together more frequently than others.
Our studies provide a detailed view of the word composition of several segments of the non-coding portion of the Arabidopsis genome. Each segment contains a unique word-based signature. The respective signatures consist of the sets of enriched words, 'unwords', and word pairs within a segment, as well as the preferential locations and functional classifications for the signature words. Additionally, the positional distributions of enriched words within the segments highlight possible functional elements, and the co-associations of words in promoter regions likely represent the formation of higher order regulatory modules. This work is an important step toward fully cataloguing the functional elements of the Arabidopsis genome.
PMCID: PMC2770528  PMID: 19814816
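A minimal sketch of an enumerative word scan in the spirit of entry 9: count every 8-letter word in a "promoter" segment, compare each word's frequency with a background frequency, and rank by fold enrichment. The sequences, pseudocount, and ranking are illustrative stand-ins for the statistics actually used.

```python
# Rank 8-bp words by how enriched they are in a "promoter" segment relative to
# a background ("genome-wide") sequence. Fold enrichment with a pseudocount is
# used here purely to illustrate the enumerative approach.
from collections import Counter

def kmer_counts(seq, k=8):
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def enriched_words(segment, background, k=8, top=5):
    seg, bg = kmer_counts(segment, k), kmer_counts(background, k)
    seg_total, bg_total = sum(seg.values()), sum(bg.values())
    scores = {}
    for word, n in seg.items():
        seg_freq = n / seg_total
        bg_freq = (bg.get(word, 0) + 1) / (bg_total + 1)  # pseudocount
        scores[word] = seg_freq / bg_freq
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top]

promoter = "GCCACGTGTCAGCCACGTGAATTGCCACGTGTTACGTA"   # made-up, repeated motif
genome   = "ATTGCATGCCTAGGATCAGTACGATTCAGGATCCATGACGTTAGCTAGGATC" * 4
for word, fold in enriched_words(promoter, genome):
    print(word, round(fold, 1))
```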
10.  Comparison of two ways of defining phonological words for assessing stuttering pattern changes with age in Spanish speakers who stutter 
Phonological words (PWs) are defined as having a single word that acts as a nucleus and an optional number of function words, preceding and following it, that act as satellites. Content and function words are one way of specifying the nucleus and satellites of a PW. PWs, defined in this way, have been found useful in the characterization of patterns of disfluency over ages for both English and Spanish speakers who stutter. Since content words carry stress in English, PWs segmented using content words as the nucleus would correspond to a large extent with PWs segmented using a stressed word as the nucleus. This correlation between word type and stress does not apply to the same extent in Spanish. Samples of Spanish from speakers of different ages were segmented into PWs using a stressed, rather than a content, word as the nucleus and unstressed, rather than function, words as satellites. PWs were partitioned into those that were common to the two segmentation methods (common set) and those that differed (different set). When PWs differed, there were two separate segmentations: those appropriate to content word nuclei and those appropriate to stressed word nuclei. The two types of segmentation on the different set were analyzed separately to see whether one, both or neither method led to patterns of disfluency similar to those reported when content words were used as nuclei in English and Spanish. Generally speaking, the patterns of stuttering in PWs found in English applied to all three analyses (the common set and the two on the different set) in Spanish. Thus, neither segmentation method showed a marked superiority in predicting the patterns of disfluency over age groups for the different set of Spanish data. It is argued that either stressed or content word status can lead to a word being a nucleus, and that there may be other factors (e.g. speech rate) underlying stressed words and content words that affect the words around these PW nuclei in a similar way.
PMCID: PMC2239249  PMID: 18270547
Developmental stuttering; Spanish; metrical influences on disfluency; lexical influences on disfluency; EXPLAN theory
11.  Phonological Words and Stuttering on Function Words 
Stuttering on function words was examined in 51 people who stutter. The people who stutter were subdivided into young (2 to 6 years), middle (6 to 9 years), and older (9 to 12 years) child groups; teenagers (13 to 18 years); and adults (20 to 40 years). As reported by previous researchers, children up to about age 9 stuttered more on function words (pronouns, articles, prepositions, conjunctions, auxiliary verbs), whereas older people tended to stutter more on content words (nouns, main verbs, adverbs, adjectives). Function words in early positions in utterances, again as reported elsewhere, were more likely to be stuttered than function words at later positions in an utterance. This was most apparent for the younger groups of speakers. For the remaining analyses, utterances were segmented into phonological words on the basis of Selkirk’s work (1984). Stuttering rate was higher when function words occurred in early phonological word positions than other phonological word positions whether the phonological word appeared in initial position in an utterance or not. Stuttering rate was highly dependent on whether the function word occurred before or after the single content word allowed in Selkirk’s (1984) phonological words. This applied, once again, whether the phonological word was utterance-initial or not. It is argued that stuttering of function words before their content word in phonological words in young speakers is used as a delaying tactic when the forthcoming content word is not prepared for articulation.
PMCID: PMC2013931  PMID: 9771625
stuttering; phonological words; function words; speech plan
12.  Communicative value of self cues in aphasia: A re-evaluation 
Aphasiology  2006;20(7):684-704.
Adults with aphasia often try mightily to produce specific words, but their word-finding attempts are frequently unsuccessful. However, the word retrieval process may contain rich information that communicates a desired message regardless of word-finding success.
The original article reprinted here reports an investigation that assessed whether patient-generated self cues inherent in the word retrieval process could be interpreted by listener/observers and improve communicative effectiveness for adults with aphasia. The newly added commentary identifies and reports tentative conclusions from 18 investigations of self-generated cues in aphasia since the 1982 paper. It further provides a rationale for increasing research on self-generated cueing and notes a surprising lack of attention to the questions investigated in the original article. The original research is also connected with more recent qualitative investigations of interactional, as opposed to transactional, communicative exchange.
Methods & Procedures
While performing single-word production tasks, 10 adults with aphasia produced 107 utterances that contained spontaneous word retrieval behaviours. To determine the “communicative value” of these behaviours, herein designated self cues or self-generated cues, the utterance-final (potential target) word was edited out and the edited utterances were dubbed onto a videotape. Six naïve observers, three of whom received some context about the nature of word retrieval in aphasia and possible topics for the utterances, and three of whom got no information, predicted the target word of each utterance from the word-finding behaviours alone. The communicative value of the self-generated cues was determined for each individual with aphasia by summing percent correct word retrieval and percent correct observer prediction of target words, based on word retrieval behaviours. The newly added commentary describes some challenges of investigating a “communicative value” outcome, and indicates what would and would not change about the methods, if we did the study today.
Outcomes & Results
The observer group that was given some context information appeared to be more successful at predicting target words than the group without any such information. Self-generated cues enhanced communication for the majority of individuals with aphasia, with some cues (e.g., descriptions/gestures of action or function) appearing to carry more communicative value than others (e.g., semantic associates). The commentary again indicates how and why we would change this portion of the investigation if conducting the study at this time.
The results are consistent with Holland’s (1977) premise that people with aphasia do well at communication, regardless of the words they produce. The finding that minimal context information may assist observers in understanding the communicative intent of people with aphasia has important implications for training family members to interpret self-generated cues. The new commentary reinforces these conclusions, highlights potential differences between self cues that improve word-finding success and those that enhance message transmission, and points to some additional research needs.
PMCID: PMC2808031  PMID: 20090926
13.  Niche as a Determinant of Word Fate in Online Groups 
PLoS ONE  2011;6(5):e19009.
Patterns of word use both reflect and influence a myriad of human activities and interactions. Like other entities that are reproduced and evolve, words rise or decline depending upon a complex interplay between their intrinsic properties and the environments in which they function. Using Internet discussion communities as model systems, we define the concept of a word niche as the relationship between the word and the characteristic features of the environments in which it is used. We develop a method to quantify two important aspects of the size of the word niche: the range of individuals using the word and the range of topics it is used to discuss. Controlling for word frequency, we show that these aspects of the word niche are strong determinants of changes in word frequency. Previous studies have already indicated that word frequency itself is a correlate of word success at historical time scales. Our analysis of changes in word frequencies over time reveals that the relative sizes of word niches are far more important than word frequencies in the dynamics of the entire vocabulary at shorter time scales, as the language adapts to new concepts and social groupings. We also distinguish endogenous versus exogenous factors as additional contributors to the fates of words, and demonstrate the force of this distinction in the rise of novel words. Our results indicate that short-term nonstationarity in word statistics is strongly driven by individual proclivities, including inclinations to provide novel information and to project a distinctive social identity.
PMCID: PMC3093376  PMID: 21589910
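As a rough illustration of the two niche dimensions named in entry 13, the range of users and the range of topics (threads) in which a word appears, the sketch below computes simple frequency-normalized counts from made-up forum posts; these ratios are stand-ins for the paper's dissemination measures, not their definition.

```python
# Toy measure of a word's "niche": how many distinct users and how many
# distinct threads it appears in, relative to how often it is used overall.
# The simple ratios below stand in for the paper's dissemination measures.
from collections import defaultdict

# Each post: (user, thread, text). Made-up forum data.
posts = [
    ("alice", "t1", "the new board game is great"),
    ("bob",   "t1", "great game indeed"),
    ("carol", "t2", "politics thread goes here"),
    ("alice", "t2", "great point about the game"),
    ("dave",  "t3", "lol that meme is great"),
]

users, threads, tokens = defaultdict(set), defaultdict(set), defaultdict(int)
for user, thread, text in posts:
    for w in text.split():
        users[w].add(user)
        threads[w].add(thread)
        tokens[w] += 1

for word in ("great", "game", "politics"):
    n = tokens[word]
    print(word,
          "freq:", n,
          "user niche:", round(len(users[word]) / n, 2),
          "thread niche:", round(len(threads[word]) / n, 2))
```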
14.  Effects of degraded sensory input on memory for speech: Behavioral data and a test of biologically constrained computational models 
Brain research  2010;1365:48-65.
Poor hearing acuity reduces memory for spoken words, even when the words are presented with enough clarity for correct recognition. An "effortful hypothesis" suggests that the perceptual effort needed for recognition draws from resources that would otherwise be available for encoding the word in memory. To assess this hypothesis, we conducted a behavioral task requiring immediate free recall of word-lists, some of which contained an acoustically masked word that was just above perceptual threshold. Results show that masking a word reduces the recall of that word and words prior to it, as well as weakening the linking associations between the masked and prior words. In contrast, recall probabilities of words following the masked word are not affected. To account for this effect, we conducted computational simulations testing two classes of models: associative linking models and short-term memory buffer models. Only a model that integrated both contextual linking and buffer components matched all of the effects of masking observed in our behavioral data. In this Linking-Buffer model, the masked word disrupts a short-term memory buffer, causing associative links of words in the buffer to be weakened, affecting memory for the masked word and the word prior to it, while allowing links of words following the masked word to be spared. We suggest that these data are consistent with the so-called "effortful hypothesis", whereby distorted input has a detrimental impact on prior information stored in short-term memory.
PMCID: PMC2993831  PMID: 20875801
modeling; simulations; recall; word lists; associations
15.  Interaction of Knowledge Sources in Spoken Word Identification 
Journal of memory and language  1985;24(2):210-231.
A gating technique was used in two studies of spoken word identification that investigated the relationship between the available acoustic–phonetic information in the speech signal and the context provided by meaningful and semantically anomalous sentences. The duration of intact spoken segments of target words and the location of these segments at the beginnings or endings of words in sentences were varied. The amount of signal duration required for word identification and the distribution of incorrect word responses were examined. Subjects were able to identify words in spoken sentences with only word-initial or only word-final acoustic–phonetic information. In meaningful sentences, less word-initial information was required to identify words than word-final information. Error analyses indicated that both acoustic–phonetic information and syntactic contextual knowledge interacted to generate the set of hypothesized word candidates used in identification. The results provide evidence that word identification is qualitatively different in meaningful sentences than in anomalous sentences or when words are presented in isolation: That is, word identification in sentences is an interactive process that makes use of several knowledge sources. In the presence of normal sentence context, the acoustic–phonetic information in the beginnings of words is particularly effective in facilitating rapid identification of words.
PMCID: PMC3513696  PMID: 23226691
16.  Do Chinese Readers Follow the National Standard Rules for Word Segmentation during Reading? 
PLoS ONE  2013;8(2):e55440.
We conducted a preliminary study to examine whether Chinese readers’ spontaneous word segmentation processing is consistent with the national standard rules of word segmentation based on the Contemporary Chinese language word segmentation specification for information processing (CCLWSSIP). Participants were asked to segment Chinese sentences into individual words according to their prior knowledge of words. The results showed that Chinese readers did not follow the segmentation rules of the CCLWSSIP, and their word segmentation processing was influenced by the syntactic categories of consecutive words. In many cases, the participants did not consider the auxiliary words, adverbs, adjectives, nouns, verbs, numerals and quantifiers as single word units. Generally, Chinese readers tended to combine function words with content words to form single word units, indicating they were inclined to chunk single words into large information units during word segmentation. Additionally, the “overextension of monosyllable words” hypothesis was tested, and it might need to be corrected to some degree, implying that word length has an implicit influence on Chinese readers’ segmentation processing. Implications of these results for models of word recognition and eye movement control are discussed.
PMCID: PMC3568123  PMID: 23408981
17.  Dysfunctional visual word form processing in progressive alexia 
Brain  2013;136(4):1260-1273.
Progressive alexia is an acquired reading deficit caused by degeneration of brain regions that are essential for written word processing. Functional imaging studies have shown that early processing of the visual word form depends on a hierarchical posterior-to-anterior processing stream in occipito-temporal cortex, whereby successive areas code increasingly larger and more complex perceptual attributes of the letter string. A region located in the left lateral occipito-temporal sulcus and adjacent fusiform gyrus shows maximal selectivity for words and has been dubbed the ‘visual word form area’. We studied two patients with progressive alexia in order to determine whether their reading deficits were associated with structural and/or functional abnormalities in this visual word form system. Voxel-based morphometry showed left-lateralized occipito-temporal atrophy in both patients, very mild in one, but moderate to severe in the other. The two patients, along with 10 control subjects, were scanned with functional magnetic resonance imaging as they viewed rapidly presented words, false font strings, or a fixation crosshair. This paradigm was optimized to reliably map brain regions involved in orthographic processing in individual subjects. All 10 control subjects showed a posterior-to-anterior gradient of selectivity for words, and all 10 showed a functionally defined visual word form area in the left hemisphere that was activated for words relative to false font strings. In contrast, neither of the two patients with progressive alexia showed any evidence for a selectivity gradient or for word-specific activation of the visual word form area. The patient with mild atrophy showed normal responses to both words and false font strings in the posterior part of the visual word form system, but a failure to develop selectivity for words in the more anterior part of the system. In contrast, the patient with moderate to severe atrophy showed minimal activation of any part of the visual word form system for either words or false font strings. Our results suggest that progressive alexia is associated with a dysfunctional visual word form system, with or without substantial cortical atrophy. Furthermore, these findings demonstrate that functional MRI has the potential to reveal the neural bases of cognitive deficits in neurodegenerative patients at very early stages, in some cases before the development of extensive atrophy.
PMCID: PMC3613714  PMID: 23471694
progressive alexia; letter-by-letter reading; posterior cortical atrophy; logopenic primary progressive aphasia; visual word form system
18.  Orthographic familiarity, phonological legality and number of orthographic neighbours affect the onset of ERP lexical effects 
It has been suggested that the variability among studies in the onset of lexical effects may be due to a series of methodological differences. In this study we investigated the role of orthographic familiarity, phonological legality and number of orthographic neighbours of words in determining the onset of word/non-word discriminative responses.
ERPs were recorded from 128 sites in 16 Italian University students engaged in a lexical decision task. Stimuli were 100 words, 100 quasi-words (obtained by the replacement of a single letter), 100 pseudo-words (non-derived) and 100 illegal letter strings. All stimuli were balanced for length; words and quasi-words were also balanced for frequency of use, domain of semantic category and imageability. SwLORETA source reconstruction was performed on ERP difference waves of interest.
Overall, the data provided evidence that the latency of lexical effects (word/non-word discrimination) varied as a function of the number of a word's orthographic neighbours, being shorter to non-derived than to derived pseudo-words. This suggests some caveats about the use in lexical decision paradigms of quasi-words obtained by transposing or replacing only 1 or 2 letters. Our findings also showed that the left occipito-temporal area, reflecting the activity of the left fusiform gyrus (BA37) of the temporal lobe, was affected by the visual familiarity of words, thus explaining its lexical sensitivity (word vs. non-word discrimination). The temporo-parietal area was markedly sensitive to phonological legality, exhibiting a clear-cut discriminative response between illegal and legal strings as early as 250 ms of latency.
The onset of lexical effects in a lexical decision paradigm depends on a series of factors, including orthographic familiarity, degree of global lexical activity, and phonologic legality of non-words.
PMCID: PMC2491646  PMID: 18601726
19.  On finding minimal absent words 
BMC Bioinformatics  2009;10:137.
The problem of finding the shortest absent words in DNA data has been recently addressed, and algorithms for its solution have been described. It has been noted that longer absent words might also be of interest, but the existing algorithms only provide generic absent words by trivially extending the shortest ones.
We show how absent words relate to the repetitions and structure of the data, and define a new and larger class of absent words, called minimal absent words, that still captures the essential properties of the shortest absent words introduced in recent works. The words of this new class are minimal in the sense that if their leftmost or rightmost character is removed, then the resulting word is no longer an absent word. We describe an algorithm for generating minimal absent words that, in practice, runs in approximately linear time. An implementation of this algorithm is publicly available at .
Because the set of minimal absent words that we propose is much larger than the set of the shortest absent words, it is potentially more useful for applications that require a richer variety of absent words. Nevertheless, the number of minimal absent words is still manageable since it grows at most linearly with the string size, unlike generic absent words that grow exponentially. Both the algorithm and the concepts upon which it depends shed additional light on the structure of absent words and complement the existing studies on the topic.
PMCID: PMC2698904  PMID: 19426495
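The definition in entry 19 translates directly into a brute-force check: a word is a minimal absent word if it does not occur in the text but both the word with its leftmost character removed and the word with its rightmost character removed do occur. The sketch below enumerates candidates this way for illustration only; it does not reproduce the paper's approximately linear-time algorithm.

```python
# Brute-force enumeration of minimal absent words of a DNA string, directly
# from the definition: w is absent, but w[1:] and w[:-1] both occur. This is
# for illustration only, not the paper's linear-time method.
from itertools import product

def factors(text, max_len):
    """All substrings of text up to length max_len."""
    present = set()
    for i in range(len(text)):
        for j in range(i + 1, min(i + max_len, len(text)) + 1):
            present.add(text[i:j])
    return present

def minimal_absent_words(text, alphabet="ACGT", max_len=4):
    present = factors(text, max_len)
    maws = []
    for length in range(2, max_len + 1):
        for w in map("".join, product(alphabet, repeat=length)):
            if w not in present and w[1:] in present and w[:-1] in present:
                maws.append(w)
    return maws

text = "ACGTACGGTACCA"
print(minimal_absent_words(text))
```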
20.  Reading Speed Benefits from Increased Vertical Word Spacing in Normal Peripheral Vision 
Crowding, the adverse spatial interaction due to proximity of adjacent targets, has been suggested as an explanation for slow reading in peripheral vision. The purposes of this study were to (1) demonstrate that crowding exists at the word level and (2) examine whether or not reading speed in central and peripheral vision can be enhanced with increased vertical word spacing.
Five normal observers read aloud sequences of six unrelated four-letter words presented on a computer monitor, one word at a time, using rapid serial visual presentation (RSVP). Reading speeds were calculated based on the RSVP exposure durations yielding 80% correct. Testing was conducted at the fovea and at 5° and 10° in the inferior visual field. Critical print size (CPS) for each observer and at each eccentricity was first determined by measuring reading speeds for four print sizes using unflanked words. We then presented words at 0.8× or 1.4× CPS, with each target word flanked by two other words, one above and one below the target word. Reading speeds were determined for vertical word spacings (baseline-to-baseline separation between two vertically separated words) ranging from 0.8× to 2× the standard single-spacing, as well as the unflanked condition.
At the fovea, reading speed increased with vertical word spacing up to about 1.2× to 1.5× the standard spacing and remained constant and similar to the unflanked reading speed at larger vertical word spacings. In the periphery, reading speed also increased with vertical word spacing, but it remained below the unflanked reading speed for all spacings tested. At 2× the standard spacing, peripheral reading speed was still about 25% lower than the unflanked reading speed for both eccentricities and print sizes. Results from a control experiment showed that the greater reliance of peripheral reading speed on vertical word spacing was also found in the right visual field.
Increased vertical word spacing, which presumably decreases the adverse effect of crowding between adjacent lines of text, benefits reading speed. This benefit is greater in peripheral than central vision.
PMCID: PMC2734885  PMID: 15252352
crowding; reading; peripheral vision; low vision
21.  Globally, unrelated protein sequences appear random 
Bioinformatics  2009;26(3):310-318.
Motivation: To test whether protein folding constraints and secondary structure sequence preferences significantly reduce the space of amino acid words in proteins, we compared the frequencies of four- and five-amino acid word clumps (independent words) in proteins to the frequencies predicted by four random sequence models.
Results: While the human proteome has many overrepresented word clumps, these words come from large protein families with biased compositions (e.g. Zn-fingers). In contrast, in a non-redundant sample of Pfam-AB, only 1% of four-amino acid word clumps (4.7% of 5mer words) are 2-fold overrepresented compared with our simplest random model [MC(0)], and 0.1% (4mers) to 0.5% (5mers) are 2-fold overrepresented compared with a window-shuffled random model. Using a false discovery rate q-value analysis, the number of exceptional four- or five-letter words in real proteins is similar to the number found when comparing words from one random model to another. Consensus overrepresented words are not enriched in conserved regions of proteins, but four-letter words are enriched 1.18- to 1.56-fold in α-helical secondary structures (but not β-strands). Five-residue consensus exceptional words are enriched for α-helix 1.43- to 1.61-fold. Protein word preferences in regular secondary structure do not appear to significantly restrict the use of sequence words in unrelated proteins, although the consensus exceptional words have a secondary structure bias for α-helix. Globally, words in protein sequences appear to be under very few constraints; for the most part, they appear to be random.
Supplementary information: Supplementary data are available at Bioinformatics online.
PMCID: PMC2852211  PMID: 19948773
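A small sketch of the comparison in entry 21: count 4-residue words in a protein sequence and in composition-preserving shuffles of it (a simple zero-order null in the spirit of the MC(0) model), then flag words that are at least 2-fold overrepresented. The sequence, shuffle count, and pseudocount are illustrative assumptions, not the paper's sampling or statistics.

```python
# Compare 4-residue word counts in a protein sequence against shuffled
# versions with the same amino-acid composition (a simple zero-order null).
# Illustrative only.
import random
from collections import Counter

def word_counts(seq, k=4):
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def overrepresented(seq, k=4, fold=2.0, n_shuffles=50, seed=0):
    rng = random.Random(seed)
    observed = word_counts(seq, k)
    expected = Counter()
    for _ in range(n_shuffles):
        shuffled = "".join(rng.sample(seq, len(seq)))  # composition-preserving
        expected.update(word_counts(shuffled, k))
    hits = {}
    for w, n in observed.items():
        if n < 2:
            continue  # ignore singletons; too noisy for a fold-change screen
        exp = (expected[w] + 1) / n_shuffles  # pseudocount avoids zero
        if n / exp >= fold:
            hits[w] = round(n / exp, 1)
    return hits

# Made-up sequence with a deliberately repeated word.
protein = "MKLAEEAAKLAEEAAKLAEEAGHTRQWVNPSDFGYKLAEEA"
print(overrepresented(protein))
```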
22.  Modeling Open-Set Spoken Word Recognition in Postlingually Deafened Adults after Cochlear Implantation: Some Preliminary Results with the Neighborhood Activation Model 
Do cochlear implants provide enough information to allow adult cochlear implant users to understand words in ways that are similar to listeners with acoustic hearing? Can we use a computational model to gain insight into the underlying mechanisms used by cochlear implant users to recognize spoken words?
The Neighborhood Activation Model has been shown to be a reasonable model of word recognition for listeners with normal hearing. The Neighborhood Activation Model assumes that words are recognized in relation to other similar-sounding words in a listener’s lexicon. The probability of correctly identifying a word is based on the phoneme perception probabilities from a listener’s closed-set consonant and vowel confusion matrices modified by the relative frequency of occurrence of the target word compared with similar-sounding words (neighbors). Common words with few similar-sounding neighbors are more likely to be selected as responses than less common words with many similar-sounding neighbors. Recent studies have shown that several of the assumptions of the Neighborhood Activation Model also hold true for cochlear implant users.
Closed-set consonant and vowel confusion matrices were obtained from 26 postlingually deafened adults who use cochlear implants. Confusion matrices were used to represent input errors to the Neighborhood Activation Model. Responses to the different stimuli were then generated by the Neighborhood Activation Model after incorporating the frequency of occurrence counts of the stimuli and their neighbors. Model outputs were compared with obtained performance measures on the Consonant-Vowel Nucleus-Consonant word test. Information transmission analysis was used to assess whether the Neighborhood Activation Model was able to successfully generate and predict word and individual phoneme recognition by cochlear implant users.
The Neighborhood Activation Model predicted Consonant-Vowel Nucleus-Consonant test words at levels similar to those correctly identified by the cochlear implant users. The Neighborhood Activation Model also predicted phoneme feature information well.
The results obtained suggest that the Neighborhood Activation Model provides a reasonable explanation of word recognition by postlingually deafened adults after cochlear implantation. It appears that multichannel cochlear implants give cochlear implant users access to their mental lexicons in a manner that is similar to listeners with acoustic hearing. The lexical properties of the test stimuli used to assess performance are important to spoken-word recognition and should be included in further models of the word recognition process.
PMCID: PMC3432952  PMID: 12851554
Cochlear implantation; Open set; Word recognition
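Entry 22 feeds closed-set phoneme confusion matrices into the Neighborhood Activation Model. One way to schematize that step, assumed here for illustration, is to approximate the probability of reporting one CVC word given another as the product of the segment-wise confusion probabilities, which can then enter a frequency-weighted choice rule like the sketch after entry 7. The matrices and words below are invented.

```python
# Approximate p(response word | stimulus word) for CVC items as the product of
# per-segment confusion probabilities taken from toy consonant and vowel
# confusion matrices. Values are invented; real matrices would come from each
# cochlear implant user's closed-set tests.
consonant_confusion = {
    "b": {"b": 0.7, "p": 0.2, "d": 0.1},
    "p": {"p": 0.6, "b": 0.3, "t": 0.1},
    "t": {"t": 0.8, "d": 0.1, "p": 0.1},
    "d": {"d": 0.7, "t": 0.2, "b": 0.1},
}
vowel_confusion = {
    "i": {"i": 0.8, "I": 0.2},
    "I": {"I": 0.6, "i": 0.4},
}

def word_confusion(stimulus, response):
    """stimulus/response are (C, V, C) tuples of phoneme symbols."""
    p = 1.0
    for seg_s, seg_r in zip(stimulus, response):
        matrix = vowel_confusion if seg_s in vowel_confusion else consonant_confusion
        p *= matrix[seg_s].get(seg_r, 0.0)
    return p

beat, bit, peat = ("b", "i", "t"), ("b", "I", "t"), ("p", "i", "t")
print(round(word_confusion(beat, beat), 3))  # correct identification
print(round(word_confusion(beat, bit), 3))   # vowel neighbor
print(round(word_confusion(beat, peat), 3))  # consonant neighbor
```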
23.  Some Computational Analyses of the PBK Test: Effects of Frequency and Lexical Density on Spoken Word Recognition 
Ear and hearing  1999;20(4):363-371.
The Phonetically Balanced Kindergarten (PBK) Test (Haskins, Reference Note 2) has been used for almost 50 yr to assess spoken word recognition performance in children with hearing impairments. The test originally consisted of four lists of 50 words, but only three of the lists (lists 1, 3, and 4) were considered “equivalent” enough to be used clinically with children. Our goal was to determine if the lexical properties of the different PBK lists could explain any differences between the three “equivalent” lists and the fourth PBK list (List 2) that has not been used in clinical testing.
Word frequency and lexical neighborhood frequency and density measures were obtained from a computerized database for all of the words on the four lists from the PBK Test as well as the words from a single PB-50 (Egan, 1948) word list.
The words in the “easy” PBK list (List 2) were of higher frequency than the words in the three “equivalent” lists. Moreover, the lexical neighborhoods of the words on the “easy” list contained fewer phonetically similar words than the neighborhoods of the words on the other three “equivalent” lists.
It is important for researchers to consider word frequency and lexical neighborhood frequency and density when constructing word lists for testing speech perception. The results of this computational analysis of the PBK Test provide additional support for the proposal that spoken words are recognized “relationally” in the context of other phonetically similar words in the lexicon. Implications of using open-set word recognition tests with children with hearing impairments are discussed with regard to the specific vocabulary and information processing demands of the PBK Test.
PMCID: PMC3466479  PMID: 10466571
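The lexical neighborhood measures used in entry 23 can be sketched as follows: for a phonemically transcribed test word, a neighbor is any lexicon word reachable by one phoneme substitution, deletion, or addition, and neighborhood density and neighborhood frequency are the count and summed frequency of those neighbors. The mini-lexicon, transcriptions, and frequency values below are invented.

```python
# Neighborhood density and neighborhood frequency for phonemically transcribed
# words: a neighbor differs by one phoneme substitution, deletion, or addition.
# The mini-lexicon and frequency counts are invented for illustration.
def is_neighbor(a, b):
    """True if b differs from a by one substitution, deletion, or addition."""
    if a == b:
        return False
    la, lb = len(a), len(b)
    if abs(la - lb) > 1:
        return False
    if la == lb:  # substitution
        return sum(x != y for x, y in zip(a, b)) == 1
    short, long_ = (a, b) if la < lb else (b, a)
    for i in range(len(long_)):  # deletion/addition
        if long_[:i] + long_[i + 1:] == short:
            return True
    return False

lexicon = {"kat": 120, "bat": 95, "kap": 30, "katz": 8, "at": 60, "dog": 85}

def neighborhood(word):
    neighbors = {w: f for w, f in lexicon.items() if is_neighbor(word, w)}
    return len(neighbors), sum(neighbors.values())

for test_word in ("kat", "dog"):
    density, freq = neighborhood(test_word)
    print(test_word, "neighbors:", density, "summed neighbor frequency:", freq)
```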
24.  Lexical-perceptual integration influences sensorimotor adaptation in speech 
A combination of lexical bias and altered auditory feedback was used to investigate the influence of higher-order linguistic knowledge on the perceptual aspects of speech motor control. Subjects produced monosyllabic real words or pseudo-words containing the vowel [ε] (as in “head”) under conditions of altered auditory feedback involving a decrease in vowel first formant (F1) frequency. This manipulation had the effect of making the vowel sound more similar to [I] (as in “hid”), affecting the lexical status of produced words in two Lexical-Change (LC) groups (either changing them from real words to pseudo-words: e.g., less—liss, or pseudo-words to real words: e.g., kess—kiss). Two Non-Lexical-Change (NLC) control groups underwent the same auditory feedback manipulation during the production of [ε] real- or pseudo-words, only without any resulting change in lexical status (real words to real words: e.g., mess—miss, or pseudo-words to pseudo-words: e.g., ness—niss). The results from the LC groups indicate that auditory-feedback-based speech motor learning is sensitive to the lexical status of the stimuli being produced, in that speakers tend to keep their acoustic speech outcomes within the auditory-perceptual space corresponding to the task-related side of the word/non-word boundary (real words or pseudo-words). For the NLC groups, however, no such effect of lexical status is observed.
PMCID: PMC4029003  PMID: 24860460
speech production; sensorimotor integration; lexical effect; altered auditory feedback; language processing
25.  Neurophysiological correlates of mismatch in lexical access 
BMC Neuroscience  2005;6:64.
In the present study neurophysiological correlates related to mismatching information in lexical access were investigated with a fragment priming paradigm. Event-related brain potentials were recorded for written words following spoken word onsets that either matched (e.g., kan – Kante [Engl. edge]), partially mismatched (e.g., kan – Konto [Engl. account]), or were unrelated (e.g., kan – Zunge [Engl. tongue]). Previous psycholinguistic research postulated the activation of multiple words in the listeners' mental lexicon which compete for recognition. Accordingly, matching words were assumed to be strongly activated competitors, which inhibit less strongly activated partially mismatching words.
ERPs for matching and unrelated control words differed between 300 and 400 ms. Difference waves (unrelated control words – matching words) replicate a left-hemispheric P350 effect in this time window. Although smaller than for matching words, a P350 effect and behavioural facilitation was also found for partially mismatching words. Minimum norm solutions point to a left hemispheric centro-temporal source of the P350 effect in both conditions. The P350 is interpreted as a neurophysiological index for the activation of matching words in the listeners' mental lexicon. In contrast to the P350 and the behavioural responses, a brain potential ranging between 350 and 500 ms (N400) was found to be equally reduced for matching and partially mismatching words as compared to unrelated control words. This latter effect might be related to strategic mechanisms in the priming situation.
A left-hemispheric neuronal network engaged in lexical access appears to be gradually activated by matching and partially mismatching words. Results suggest that neural processing of matching words does not inhibit processing of partially mismatching words during early stages of lexical identification. Furthermore, the present results indicate that neurophysiological correlates observed in fragment priming reflect different aspects of target processing that are cumulated in behavioural responses. Particularly the left-hemispheric P350 difference potential appears to be closely related to fine-grained activation differences of modality-independent representations in the listeners' mental lexicon. This neurophysiological index might guide future studies aimed at investigating neural aspects of lexical access.
PMCID: PMC1308819  PMID: 16283934
