1. Chang JT, Altman RB. Promises of text processing: natural language processing meets AI Drug Discov Today 2002;7:992-993. [PubMed] 2. Lovis C, Baud RH. Fast exact string pattern matching algorithms adapted to the characteristics of the medical language J Am Med Inform Assoc 2000;7:378-391. [PMC free article] [PubMed] 3. Friedman C, Alderson PO, Austin JH, Cimino JJ, Johnson SB. A general natural language text processor for clinical radiology J Am Med Inform Assoc 1994;1:161-174. [PMC free article] [PubMed] 4. Haug PJ, Koehler S, Lau LM, Wang P, Rocha R, Huff SM. Experience with a mixed semantic/syntactic parser Proc Annu Symp Comput Appl Med Care 1995;19:284-288. [PMC free article] [PubMed] 5. Goryachev S, Sordo M, Zeng QT. A suite of natural language processing tools developed for the i2b2 project AMIA Annu Symp Proc 2006:931.
6. Bates W, Evans RS, Murff H, Stetson PD, Pizziferri L, Hripcsak G. Detecting adverse events using information technology J Am Med Inform Assoc 2003;10:115-128. [PMC free article] [PubMed] 7. Westbrook JI, Coiera EW, Gosling AS. Do online information retrieval systems help experienced clinicians answer clinical questions? J Am Med Inform Assoc 2005;12:315-321. [PMC free article] [PubMed] 8. Melton GB, Hripcsak G. Automated detection of adverse events using natural language processing of discharge summaries J Am Med Inform Assoc 2005;12:448-457. [PMC free article] [PubMed] 9. Uzuner Ö, Luo Y, Szolovits P. Evaluating the state of the art in automatic de-identification J Am Med Inform Assoc 2007;14:550-563. [PMC free article] [PubMed] 10. Grishman R, Sundheim B. Message Understanding Conference 6: a brief history16th Conference on Computational Linguistics; 1996. Copenhagen, Denmark: Association for Computational Linguistics; 1996. pp. 466-471.
11. NISThttp://www.nist.gov/speech/tests/ 2005. Accessed February 15, 2007.
12. Krallinger M. BioCreAtIvEhttp://biocreative.sourceforge.net/ 2006. Accessed February 15, 2007.
13. Hersh WR, Muller H, Jensen JR, Yang J, Gorman PN, Ruch P. Advancing biomedical image retrieval: development and analysis of a test collection J Am Med Inform Assoc 2006;13:488-496. [PMC free article] [PubMed] 14. Braschler M, Peters C. Cross language evaluation forum: objectives, results, achievements Information Retrieval 2004;7:7-31.
15. Hersh W, Bhupatiraju RT, Corley S. Enhancing access to the Bibliome: the TREC Genomics track MedInfo 2004;11:773-777.
16. Sparck Jones K. Reflections on TREC Info Process Manage 1995;31:291-314.
17. Hirschman L. The evolution of evaluation: lessons from the message understanding conferences Comput Speech Lang 1998;12:281-305.
18. Hirschman L, Yeh A, Blaschke C, Valencia A. Overview of BioCreAtIvE: critical assessment of information extraction for biology BMC Bioinform 2005;6:S1.
19. Chapman WW, Christensen LM, Wagner MM, et al. Classifying free text triage chief complaints into syndromic categories with natural language processing Artif Intell Med 2005;33:31-40. [PubMed] 20. Chapman WW, Dowling JN, Wagner MM. Classification of emergency department chief complaints into 7 syndromes: a retrospective analysis of 527,228 patients Ann Emerg Med 2005;46:445-455. [PubMed] 21. Chute CG. Clinical classification and terminology: some history and current observations J Am Med Inform Assoc 2000;7:298-303. [PMC free article] [PubMed] 22. Huang Y, Lowe HJ. A grammar based classification of negations in clinical radiology reports Proc AMIA Annu Fall Symp 2005:988.
23. Sibanda T, He T, Szolovits P, Uzuner Ö. Syntactically informed semantic category recognizer for discharge summaries Proc AMIA Annu Fall Symp 2006:714-718.
24. Chapman WW, Bridewell W, Hanbury P, Cooper GF, Buchanan BG. A simple algorithm for identifying negated findings and diseases in discharge summaries J Biomed Inform 2001;34:301-310. [PubMed] 25. Hellesø R. Information handling in the nursing discharge note J Clin Nurs 2006;15:11-21. [PubMed] 26. Hripcsak G, Zhou L, Parsons S, Das AK, Johnson SB. Modeling electronic discharge summaries as a simple temporal constraint satisfaction problem J Am Med Inform Assoc 2005;12:55-63. [PMC free article] [PubMed] 27. Kukafka R, Bales ME, Burkhardt A, Friedman C. Human and automated coding of rehabilitation discharge summaries according to the international classification of functioning, disability, and health J Am Med Inform Assoc 2006;13:508-515. [PMC free article] [PubMed] 28. Liu H, Friedman C. CliniViewer: a tool for viewing electronic medical records based on natural language processing and XML MedInfo 2004;11:639-643.
29. Zhou L, Melton GB, Parsons S, Hripcsak G. A temporal constraint structure for extracting temporal information from clinical narrative J Biomed Inform 2006;39:424-439. [PubMed] 30. Zeng QT, Goryachev S, Weiss S, Sordo M, Murphy SN, Lazarus R. Extracting principal diagnosis, co-morbidity and smoking status for asthma research: evaluation of a natural language processing system BMC Med Inform Decis Mak 2006;6:30. [PMC free article] [PubMed] 31. Cohen J. A coefficient of agreement for nominal scales Educ Psychol Meas 1960;20:37-46.
32. What is Kappa November 25, 2000http://www.musc.edu/dc/icrebm/kappa.html 1960. Accessed February 13, 2007.
33. Hripcsak G, Heitjan DF. Measuring agreement in medical informatics reliability studies J Biomed Inform 2002;35:99-110. [PubMed] 34. Krippendorff K. Content Analysis: An Introduction to Its Methodology2nd ed.. Thousand Oaks, California: Sage; 2004.
35. Congalton RG. A review of assessing the accuracy of classifications of remotely sensed data Remote Sensing Environ 1991;37:35-46.
36. Yang Y, Liu X. A re-examination of text categorization methods Proceedings of ACM SIGIR Conference on Research and Development in Information Retrieval, Berkeley, California. New York: ACM Press; 1999. pp. 42-49.
37. Salton G, McGill MJ. Introduction to Modern Information RetrievalNew York: McGraw Hill; 1983.
38. Hripcsak G, Rothschild AS. Agreement, the F measure, and reliability in information retrieval J Am Med Inform Assoc 2005;12:296-298. [PMC free article] [PubMed] 39. Chinchor N. The Statistical Significance of the MUC 4 ResultsFourth Message Understanding Conference (MUC 4), McLean, Virginia. Morristown, NJ: Association for Computational Linguistics; 1992.
40. Aramaki E, Imai T, Miyo K, Ohe K. Patient Status Classification by Using Rule based Sentence Extraction and BM25 kNN-based Classifieri2b2 Workshop on Challenges in Natural Language Processing for Clinical Data. 2006.
41. Carrero FM, Gómez Hidalgo JM, Puertas E, Maña M, Mata J. Quick Prototyping of High Performance Text Classifiersi2b2 Workshop on Challenges in Natural Language Processing for Clinical Data. 2006.
42. Clark C, Good K, Jezierny L, Macpherson M, Wilson B, Chajewska U. Identifying smokers with a medical extraction system J Am Med Inform Assoc 2008;15:36-39. [PMC free article] [PubMed] 43. Xu H, Markatou M, Dimova R, Liu H, Friedman C. Machine learning and word sense disambiguation in the biomedical domain: design and evaluation issues BMC Bioinformatics 2006;7:334-350. [PMC free article] [PubMed] 44. Witten IH, Frank E. Data Mining: Practical Machine Learning Tools and Techniques2nd ed.. San Francisco: Morgan Kaufmann; 2005.
45. Cohen AM. Five-way smoking status classification using text hot-spot identification and error-correcting output codes J Am Med Inform Assoc 2008;15:32-35. [PMC free article] [PubMed] 46. Gospodnetic O, Hatcher E. Lucene in ActionGreenwich, Conn: Manning Publications; 2005.
47. Guillen R. Automated De-identification and Categorization of Medical Recordsi2b2 Workshop on Challenges in Natural Language Processing for Clinical Data. 2006.
48. Pedersen T. Determining Smoker Status using Supervised and Unsupervised Learning with Lexical Featuresi2b2 Workshop on Challenges in Natural Language Processing for Clinical Data. 2006. Available as JAMIA on-line data supplement to the current article, at www.jamia.org.
49. Rekdal M. Identifying Smoking Status Using Argus MLPi2b2 Workshop on Challenges in Natural Language Processing for Clinical Data. 2006.
50. Savova GK, Ogren PV, Duffy PH, Buntrock JD, Chute CG. Mayo Clinic NLP system for patient smoking status identification J Am Med Inform Assoc 2008;15:25-28. [PMC free article] [PubMed] 51. Heinze DT, Morsch ML, Potter BC, Sheffer RE. A-Life Medical I2B2 NLP smoking challenge system architecture and methodology J Am Med Inform Assoc 2008;15:40-43. [PMC free article] [PubMed] 52. Szarvas G, Farkas R, Iván S, Kocsor A, Busa Fekete R. Automatic Extraction of Semantic Content from Medical Discharge Recordsi2b2 Workshop on Challenges in Natural Language Processing for Clinical Data. 2006.
53. Wicentowski R, Sydes MR. Using implicit information to identify smoking status in smoke-blind medical discharge summaries J Am Med Inform Assoc 2008;15:29-31. [PMC free article] [PubMed] 54. Miller RA. Reference standards in evaluating system performance J Am Med Inform Assoc 2002;9:87-88. [PMC free article] [PubMed]