Morphosyntactic annotation of CHILDES transcripts
Journal of child language
Corpora of child language are essential for research in child language acquisition and psycholinguistics. Linguistic annotation of the corpora provides researchers with better means for exploring the development of grammatical constructions and their usage. We describe a project whose goal is to annotate the English section of the CHILDES database with grammatical relations in the form of labeled dependency structures. We have produced a corpus of over 18,800 utterances (approximately 65,000 words) with manually curated gold-standard grammatical relation annotations. Using this corpus, we have developed a highly accurate data-driven parser for the English CHILDES data, which we used to automatically annotate the remainder of the English section of CHILDES. We have also extended the parser to Spanish, and are currently working on supporting more languages. The parser and the manually and automatically annotated data are freely available for research purposes.
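To make "grammatical relations in the form of labeled dependency structures" concrete, here is a minimal sketch of one way such an annotation could be represented and sanity-checked. The `Dependency` class, the `is_well_formed` helper, and the relation labels (`DET`, `SUBJ`, `ROOT`) are illustrative assumptions, not the project's actual annotation scheme or file format.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Dependency:
    dependent: int  # 1-based index of the dependent word
    head: int       # 1-based index of the head word; 0 marks the root
    label: str      # grammatical relation label (hypothetical names)


def is_well_formed(n_words: int, deps: List[Dependency]) -> bool:
    """Return True if every word has exactly one head, there is exactly
    one root, and following heads from any word reaches the root."""
    heads: Dict[int, int] = {}
    for d in deps:
        if not (1 <= d.dependent <= n_words and 0 <= d.head <= n_words):
            return False  # index out of range
        if d.dependent in heads:
            return False  # a word may not have two heads
        heads[d.dependent] = d.head
    if len(heads) != n_words:
        return False  # some word has no head
    if sum(1 for h in heads.values() if h == 0) != 1:
        return False  # exactly one root is required
    for start in heads:
        seen = set()
        node = start
        while node != 0:
            if node in seen:
                return False  # cycle detected
            seen.add(node)
            node = heads[node]
    return True


# Illustrative utterance: "the dog barks"
words = ["the", "dog", "barks"]
deps = [
    Dependency(1, 2, "DET"),    # "the" modifies "dog"
    Dependency(2, 3, "SUBJ"),   # "dog" is the subject of "barks"
    Dependency(3, 0, "ROOT"),   # "barks" is the root of the utterance
]
print(is_well_formed(len(words), deps))
```

Each word is attached to exactly one head with one label, which is what lets a set of such triples be read as a tree over the utterance.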
Evaluating contributions of natural language parsers to protein–protein interaction extraction
Motivation: While text mining technologies for biomedical research have gained popularity as a way to take advantage of the explosive growth of information in text form in biomedical papers, selecting appropriate natural language processing (NLP) tools is still difficult for researchers who are not familiar with recent advances in NLP. This article provides a comparative evaluation of several state-of-the-art natural language parsers, focusing on the task of extracting protein–protein interaction (PPI) from biomedical papers. We measure how each parser, and its output representation, contributes to accuracy improvement when the parser is used as a component in a PPI system.
Results: All the parsers attained improvements in accuracy of PPI extraction. The levels of accuracy obtained with these different parsers vary slightly, while differences in parsing speed are larger. The best accuracy in this work was obtained when we combined Miyao and Tsujii's Enju parser and Charniak and Johnson's reranking parser, and the accuracy is better than the state-of-the-art results on the same data.
Availability: The PPI extraction system used in this work (AkanePPI) is available online at http://www-tsujii.is.s.u-tokyo.ac.jp/-100downloads/downloads.cgi. The evaluated parsers are also available online from each developer's site.
Automatic Parsing of Parental Verbal Input
Behavior research methods, instruments, & computers : a journal of the Psychonomic Society, Inc
To evaluate theoretical proposals regarding the course of child language acquisition, researchers often need to rely on the processing of large numbers of syntactically parsed utterances, both from children and their parents. Because it is so difficult to do this by hand, there are currently no parsed corpora of child language input data. To automate this process, we developed a system that combined the MOR tagger, a rule-based parser, and statistical disambiguation techniques. The resultant system obtained nearly 80% correct parses for the sentences spoken to children. To achieve this level, we had to construct a particular processing sequence that minimizes problems caused by the coverage/ambiguity trade-off in parser design. These procedures are particularly appropriate for use with the CHILDES database, an international corpus of transcripts. The data and programs are now freely available over the Internet.
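As a rough illustration of a figure such as "nearly 80% correct parses", the sketch below computes the share of sentences whose predicted parse matches a reference parse exactly. The abstract does not say how a "correct" parse was defined, so the exact-match criterion, the `(dependent, head, label)` tuple representation, and the `correct_parse_rate` function are all assumptions made for this example.

```python
from typing import List, Tuple

# A parse is assumed to be a list of (dependent, head, label) triples.
Parse = List[Tuple[int, int, str]]


def correct_parse_rate(predicted: List[Parse], gold: List[Parse]) -> float:
    """Fraction of sentences whose predicted parse equals the gold parse
    exactly (same set of labeled dependencies)."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must cover the same sentences")
    if not gold:
        raise ValueError("at least one sentence is required")
    matches = sum(1 for p, g in zip(predicted, gold) if set(p) == set(g))
    return matches / len(gold)


# Toy data: five two-word sentences, one of which is parsed incorrectly.
gold_parses = [[(1, 2, "SUBJ"), (2, 0, "ROOT")] for _ in range(5)]
predicted_parses = [list(p) for p in gold_parses]
predicted_parses[4] = [(1, 0, "ROOT"), (2, 1, "OBJ")]  # wrong parse

rate = correct_parse_rate(predicted_parses, gold_parses)
print(f"{rate:.0%} of sentences parsed correctly")
```

Whole-sentence exact match is a strict measure; a per-dependency score would generally give a higher number for the same output.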