Modeling the internal variability of multiword expressions through a pattern-based method



References (61)

Publisher: Association for Computing Machinery
Copyright: © 2013 by ACM Inc.
ISSN: 1550-4875
DOI: 10.1145/2483691.2483696

Abstract

Malvina Nissim, University of Bologna
Andrea Zaninello, Zanichelli editore, Bologna

The issue of internal variability of multiword expressions (MWEs) is crucial to their identification and extraction in running text. We present a corpus-supported and computational study on Italian MWEs, aimed at defining an automatic method for modeling internal variation, exploiting frequency and part-of-speech (POS) information. We do so by deriving an XML-encoded lexicon of MWEs based on a manually compiled dictionary, which is then projected onto a large corpus. Since a search for fixed forms suffers from low recall, while an unconstrained flexible search for lemmas yields a loss in precision, we suggest a procedure aimed at maximizing precision in the identification of MWEs within a flexible search. Our method builds on the idea that internal variability can be modeled via the novel introduction of variation patterns, which work over POS patterns, and can be used as working tools for controlling precision. We also compare the performance of variation patterns to that of association measures, and explore the possibility of using variation patterns in MWE extraction in addition to identification. Finally, we suggest that corpus-derived, pattern-related
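The contrast the abstract draws between a fixed-form search, an unconstrained lemma search, and a POS-constrained search can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the English toy sentences, the `Token` type, the `max_gap` window, the function names, and the set of allowed POS patterns are all hypothetical, and the paper's actual variation patterns and Italian data are not reproduced here.

```python
from typing import List, NamedTuple, Optional, Set, Tuple


class Token(NamedTuple):
    form: str   # surface form as it appears in the text
    lemma: str  # dictionary form
    pos: str    # coarse part-of-speech tag (hypothetical tagset)


def fixed_search(sent: List[Token], forms: List[str]) -> List[int]:
    """Start indices where the exact surface forms occur contiguously."""
    n = len(forms)
    return [i for i in range(len(sent) - n + 1)
            if [t.form.lower() for t in sent[i:i + n]] == forms]


def flexible_search(sent: List[Token], lemmas: List[str],
                    max_gap: int = 2) -> List[Tuple[int, int]]:
    """Spans where the lemmas occur in order, allowing up to max_gap
    intervening tokens between consecutive components."""
    spans = []
    for start, tok in enumerate(sent):
        if tok.lemma != lemmas[0]:
            continue
        pos: Optional[int] = start
        for lemma in lemmas[1:]:
            window = range(pos + 1, min(pos + 2 + max_gap, len(sent)))
            pos = next((j for j in window if sent[j].lemma == lemma), None)
            if pos is None:
                break
        if pos is not None:
            spans.append((start, pos + 1))
    return spans


def pos_pattern(sent: List[Token], span: Tuple[int, int]) -> str:
    """POS sequence of the matched span, used here as its pattern."""
    return " ".join(t.pos for t in sent[span[0]:span[1]])


def filter_by_patterns(sent: List[Token], spans: List[Tuple[int, int]],
                       allowed: Set[str]) -> List[Tuple[int, int]]:
    """Keep only the flexible matches whose POS pattern is allowed."""
    return [s for s in spans if pos_pattern(sent, s) in allowed]


# Invented toy data for the hypothetical entry "point of view".
SENT_A = [Token("her", "her", "PRON"), Token("point", "point", "NOUN"),
          Token("of", "of", "ADP"), Token("view", "view", "NOUN")]
SENT_B = [Token("their", "their", "PRON"), Token("points", "point", "NOUN"),
          Token("of", "of", "ADP"), Token("view", "view", "NOUN")]
SENT_C = [Token("the", "the", "DET"), Token("point", "point", "NOUN"),
          Token("of", "of", "ADP"), Token("the", "the", "DET"),
          Token("view", "view", "NOUN")]

FORMS = ["point", "of", "view"]
LEMMAS = ["point", "of", "view"]
ALLOWED = {"NOUN ADP NOUN"}  # assumed list of accepted POS patterns
```

In this toy setting the fixed search misses the plural in `SENT_B` (low recall), the flexible search also accepts the inserted determiner in `SENT_C` (lower precision), and filtering on the POS pattern of the matched span keeps the first two matches while rejecting the third. How the accepted patterns would be derived from corpus frequencies is not shown, since the abstract does not give that detail.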

Journal

ACM Transactions on Speech and Language Processing (TSLP), Association for Computing Machinery

Published: Jun 1, 2013
