Abstract

Objective: To develop, evaluate, and share: (1) syntactic parsing guidelines for clinical text, with a new approach to handling ill-formed sentences; and (2) a clinical Treebank annotated according to the guidelines. To document the process and findings for readers with similar interest.

Methods: Using random samples from a shared natural language processing challenge dataset, we developed a handbook of domain-customized syntactic parsing guidelines based on iterative annotation and adjudication between two institutions. Special considerations were incorporated into the guidelines for handling ill-formed sentences, which are common in clinical text. Intra- and inter-annotator agreement rates were used to evaluate consistency in following the guidelines. Quantitative and qualitative properties of the annotated Treebank, as well as its use to retrain a statistical parser, were reported.

Results: A supplement to the Penn Treebank II guidelines was developed for annotating clinical sentences. After three iterations of annotation and adjudication on 450 sentences, the annotators reached an F-measure agreement rate of 0.930 (while intra-annotator rate was 0.948) on a final independent set. A total of 1100 sentences from progress notes were annotated that demonstrated domain-specific linguistic features. A statistical parser retrained with combined general English (mainly news text) annotations and our annotations achieved an accuracy of 0.811 (higher than models trained purely with either general or clinical sentences alone). Both the guidelines and syntactic annotations are made available at https://sourceforge.net/projects/medicaltreebank.

Conclusions: We developed guidelines for parsing clinical text and annotated a corpus accordingly. The high intra- and inter-annotator agreement rates showed decent consistency in following the guidelines. The corpus was shown to be useful in retraining a statistical parser that achieved moderate accuracy.
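To illustrate what an F-measure agreement rate between two annotators can mean for bracketed parses, the sketch below compares two Penn Treebank-style trees by their labeled constituent spans. The abstract does not specify the exact matching procedure, so PARSEVAL-style labeled bracket matching (ignoring part-of-speech nodes) is an assumption here; the function names `labeled_spans` and `bracket_f_measure` and the two example trees are hypothetical and are not taken from the paper or its corpus.

```python
from collections import Counter


def labeled_spans(tree: str) -> Counter:
    """Collect (label, start, end) constituents from a bracketed tree string.

    Assumption: part-of-speech nodes (nodes that directly dominate a word)
    are skipped, as in PARSEVAL-style scoring.
    """
    tokens = tree.replace("(", " ( ").replace(")", " ) ").split()
    spans = Counter()
    stack = []          # each entry: [label, start_word_index, is_pos_node]
    word_index = 0
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "(":
            nxt = tokens[i + 1] if i + 1 < len(tokens) else ")"
            if nxt in ("(", ")"):
                # Unlabeled wrapper bracket, e.g. "((S ...))"
                stack.append(["", word_index, False])
                i += 1
            else:
                stack.append([nxt, word_index, False])
                i += 2
        elif tok == ")":
            label, start, is_pos_node = stack.pop()
            if not is_pos_node and label:
                spans[(label, start, word_index)] += 1
            i += 1
        else:
            # A word: its parent is a part-of-speech node.
            stack[-1][2] = True
            word_index += 1
            i += 1
    return spans


def bracket_f_measure(tree_a: str, tree_b: str) -> float:
    """Symmetric F-measure over labeled spans of two annotations."""
    a, b = labeled_spans(tree_a), labeled_spans(tree_b)
    total = sum(a.values()) + sum(b.values())
    if total == 0:
        return 1.0  # neither annotation has any constituents to disagree on
    matched = sum((a & b).values())
    return 2 * matched / total


# Hypothetical annotations of the same telegraphic clinical sentence.
annotator_1 = "(S (NP (NN patient)) (VP (VBZ denies) (NP (NN chest) (NN pain))))"
annotator_2 = "(S (NP (NN patient)) (VP (VBZ denies) (NP (NN chest)) (NP (NN pain))))"

print(round(bracket_f_measure(annotator_1, annotator_2), 3))  # 0.667
```

Because the F-measure is the harmonic mean of precision and recall, it is symmetric in the two annotators, so neither annotation has to be designated as the gold standard.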
Journal of the American Medical Informatics Association – Oxford University Press
Published: Nov 1, 2013