Article published in: Language and Linguistics
Vol. 19:1 (2018) ► pp.61–79
Filtered collocations as features in verbal polysemy disambiguation
A case study of the Chinese verb kao ‘bake’
Available under the Creative Commons Attribution (CC BY) 4.0 license.
For any use beyond this license, please contact the publisher at rights@benjamins.nl.
Published online: 5 January 2018
https://doi.org/10.1075/lali.00003.cha
Abstract
In Generative Lexicon Theory (GLT) (Pustejovsky, James. 1995. The generative lexicon. Cambridge: The MIT Press.), co-composition is one of the generative devices proposed to explain cases of verbal polysemy in which more than one function application is allowed. English baking verbs have been used to illustrate how a verb's arguments co-specify it through qualia unification. Some studies (Blutner, Reinhard. 2002. Lexical semantics and pragmatics. Linguistische Berichte 101. 27–58.; Carston, Robyn. 2002. Thoughts and utterances: The pragmatics of explicit communication. Oxford: Blackwell.; Falkum, Ingrid Lossius. 2007. Generativity, relevance and the problem of polysemy. UCL Working Papers in Linguistics 191. 205–234.) have argued that pragmatic information and world knowledge need to be considered as well. This study therefore examines whether GLT can be put into practice in a real-world Natural Language Processing (NLP) application that uses collocations. We conducted a fine-grained logical polysemy disambiguation task, using the open-source Leiden Weibo Corpus as the data source and a Support Vector Machine (SVM) classifier. The classifier takes collocated verbs, selected under GLT, as its main features; measure words and syntactic patterns are extracted as additional features for comparison. The study investigates the logical polysemy of the Chinese verb kao ‘bake’. We find that GLT can help in identifying logically polysemous cases and that the additional features help the classifier achieve higher performance.
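The feature setup described in the abstract can be illustrated with a minimal sketch. It assumes each noun that occurs as an object of kao is represented by counts of its collocated verbs and measure words, and it builds one table with all features and one with collocated verbs only, so the two feature sets can be compared. The nouns, collocates, and the helper name `build_feature_table` are hypothetical and chosen for illustration; none of them are taken from the article or from the Leiden Weibo Corpus.

```python
from collections import Counter

# Hypothetical toy observations: (noun, feature_type, feature_value).
# These rows are invented for illustration and are not the article's data.
OBSERVATIONS = [
    ("mianbao", "verb", "zuo"),      # 'bread' seen with 'make'
    ("mianbao", "measure", "kuai"),
    ("dangao", "verb", "zuo"),       # 'cake' seen with 'make'
    ("dangao", "measure", "kuai"),
    ("hongshu", "verb", "xi"),       # 'sweet potato' seen with 'wash'
    ("hongshu", "measure", "ge"),
]


def build_feature_table(observations, feature_types=None):
    """Return (columns, rows): rows maps each noun to a list of counts.

    If feature_types is given, only those feature types become columns.
    """
    kept = [(n, t, v) for n, t, v in observations
            if feature_types is None or t in feature_types]
    # One column per distinct "type=value" pair, in sorted order.
    columns = sorted({f"{t}={v}" for _, t, v in kept})
    counts = Counter((n, f"{t}={v}") for n, t, v in kept)
    # Every noun gets a row, even if it has no feature of the kept types.
    nouns = sorted({n for n, _, _ in observations})
    rows = {n: [counts[(n, c)] for c in columns] for n in nouns}
    return columns, rows


# All features versus collocated verbs only.
columns, rows = build_feature_table(OBSERVATIONS)
verb_columns, verb_rows = build_feature_table(OBSERVATIONS, {"verb"})
```

Each row is a numeric vector that could be passed, together with a sense label such as creation or change of state, to any SVM implementation. Training the classifier once per feature table is one way to compare the contribution of the additional features.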
Article outline
- 1. Introduction
- 2. Co-composition and qualia structure in GLT
- 3. Methodology
- 3.1 Data collection
- 3.2 Filtering nouns as seeds
- 3.3 Selection of collocated features: verbs, measure words and syntactic constructions
- 3.3.1 Collocated verbs and measure words
- 3.3.2 Collocated syntactic patterns
- 3.4 Constructing a data frame with features for SVM classification
- 3.5 SVM classification
- 4. Analysis and discussion
- 4.1 The collocated verbs with measure words
- 4.1.1 Features for nouns with change of state senses
- 4.1.2 Features for nouns with creation senses
- 4.2.3 Others
- 4.2 Pattern 3+4
- 5. Conclusion
- Acknowledgements
- Notes
- Abbreviations
- References
References (24)
Atkins, Beryl T. & Kegl, Judy & Levin, Beth. 1988. Anatomy of a verb entry: From linguistic theory to lexicographic practice. International Journal of Lexicography 11. 84–126.
Carston, Robyn. 2002. Thoughts and utterances: The pragmatics of explicit communication. Oxford: Blackwell.
Chinese Knowledge Information Processing Group. 1993. Technical report no. 93–05: Zhongwen cilei fenxi (sanban) [The analysis of Chinese parts of speech (the third version)]. Taipei: Institute of Information Science, Academia Sinica.
Davis, Anthony R. & Koenig, Jean-Pierre. 2000. Linking as constraints on word classes in a hierarchical lexicon. Language 76(1). 56–91.
Falkum, Ingrid Lossius. 2007. Generativity, relevance and the problem of polysemy. UCL Working Papers in Linguistics 191. 205–234.
Fodor, Jerry A. & Lepore, Ernie. 1998. The emptiness of the lexicon: Reflections on James Pustejovsky’s ‘The generative lexicon’. Linguistic Inquiry 29(2). 269–288.
Gale, William A. & Church, Kenneth W. & Yarowsky, David. 1993. A method for disambiguating word senses in a large corpus. Computers and the Humanities 261. 415–439.
Jackendoff, Ray. 2002. Foundations of language: Brain, meaning, grammar, evolution. Oxford: Oxford University Press.
Kamp, Hans & Partee, Barbara. 1995. Prototype theory and compositionality. Cognition 57(2). 129–191.
Levin, Beth & Rappaport Hovav, Malka. 2005. Argument realization. Cambridge: Cambridge University Press.
Moravcsik, Julius M. 1975. Aitia as generative factor in Aristotle’s philosophy. Dialogue 141. 622–636.
Partee, Barbara H. 1992. Syntactic categories and semantic type. In Rosner, Michael & Johnson, Roderick (eds.), Computational linguistics and formal semantics, 97–126. Cambridge: Cambridge University Press.
Princeton University: About WordNet. 2010–2016. WordNet. ([URL]) (Accessed 2014-09-01.)
Sperber, Dan & Wilson, Deirdre. 1995. Relevance: Communication and cognition. 2nd edn. Oxford: Blackwell.
van Esch, Daan. 2012. Leiden Weibo Corpus. ([URL]) (Accessed 2014-09-03.)
Van Valin Jr., Robert D. 2005. Exploring the syntax-semantics interface. Cambridge: Cambridge University Press.
