Article published in: Lingvisticæ Investigationes
Vol. 26:2 (2004), pp. 187–202
Automatic classification of multi-word expressions in print dictionaries
Published online: 30 July 2004
https://doi.org/10.1075/li.26.2.03gey
Summary
This work demonstrates how multi-word expressions in print dictionaries can be assigned to part-of-speech (POS) classes with minimal linguistic resources. In this application, 32,000 entries from the Wörterbuch der deutschen Idiomatik (H. Schemann 1993) were classified using an inductive description of POS sequences in conjunction with a Brill tagger trained on manually tagged idiomatic entries. The process assigned categories to 86% of the entries with 88% accuracy. The classification provides a meaningful preprocessing step for further applications: the resulting POS sequences for all idiomatic entries might be used for the automatic recognition of multi-word lexemes in unrestricted text.
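As an illustration only, the idea of classifying an entry by its POS sequence can be sketched as matching the tag sequence against a small pattern inventory. The sketch assumes STTS-style tags have already been produced by a tagger. The pattern list, the class labels and the function names `classify` and `coverage` are hypothetical and are not taken from the article.

```python
import re

# Hypothetical pattern inventory: (regex over a space-joined tag sequence, class label).
# STTS-style tags are assumed; the article's actual patterns and classes are not shown here.
PATTERNS = [
    (re.compile(r"(ART )?(ADJA )?NN( APPR (ART )?NN)?"), "nominal"),
    (re.compile(r"(.* )?VVINF"), "verbal"),
    (re.compile(r"APPR (ART )?(ADJA )?NN"), "prepositional"),
]


def classify(tags):
    """Return the label of the first pattern matching the whole tag sequence, else None."""
    seq = " ".join(tags)
    for pattern, label in PATTERNS:
        if pattern.fullmatch(seq):
            return label
    return None


def coverage(entries):
    """Fraction of entries (each a list of tags) that receive some label."""
    if not entries:
        return 0.0
    labelled = sum(1 for tags in entries if classify(tags) is not None)
    return labelled / len(entries)


# "den Nagel auf den Kopf treffen", tagged by hand for this example
print(classify(["ART", "NN", "APPR", "ART", "NN", "VVINF"]))
```

Entries whose tag sequence matches no pattern stay unclassified, which is one way a coverage figure below 100% can arise in a scheme of this kind.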
