In: Word Classes: Nature, typology and representations
Edited by Raffaele Simone and Francesca Masini
[Current Issues in Linguistic Theory 332] 2014
pp. 17–36
Carving verb classes from corpora
Published online: 24 September 2014
https://doi.org/10.1075/cilt.332.02len
In this paper, I discuss some methodological problems arising from the use of corpus data for semantic verb classification. In particular, I present a computational framework to describe the distributional properties of Italian verbs using linguistic data automatically extracted from a large corpus. This information is used to build a distribution-based classification of a set of Italian verbs. Its small scale notwithstanding, this case study provides evidence for the complex interplay between syntactic and semantic verb features.
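The abstract does not spell out the framework, so the following is only a minimal sketch of what a distribution-based verb classification can look like in general, not the author's method. It assumes that each verb is represented by counts of the syntactic frames it occurs with, that verbs are compared by cosine similarity, and that classes are formed by a simple greedy threshold rule. The counts, the frame labels (`subj`, `subj+obj`, `subj+pp_a`), the function names and the threshold of 0.8 are all invented for illustration and are not taken from the paper or its corpus.

```python
import math

# Toy, invented counts (NOT data from the paper): how often each verb
# occurs with each hypothetical syntactic frame in an imaginary corpus.
TOY_COUNTS = {
    "mangiare": {"subj+obj": 8, "subj": 2, "subj+pp_a": 0},
    "bere":     {"subj+obj": 6, "subj": 3, "subj+pp_a": 0},
    "andare":   {"subj+obj": 0, "subj": 3, "subj+pp_a": 9},
    "venire":   {"subj+obj": 0, "subj": 4, "subj+pp_a": 7},
}


def cosine(u, v):
    """Cosine similarity between two sparse count vectors stored as dicts."""
    keys = set(u) | set(v)
    dot = sum(u.get(k, 0) * v.get(k, 0) for k in keys)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an all-zero vector is treated as similar to nothing
    return dot / (norm_u * norm_v)


def nearest_neighbour(verb, counts):
    """Return the other verb whose frame distribution is most similar."""
    others = [v for v in counts if v != verb]
    return max(others, key=lambda v: cosine(counts[verb], counts[v]))


def group_by_threshold(counts, threshold=0.8):
    """Greedy grouping (a deliberate simplification of real clustering):
    each verb joins the first existing class containing a verb whose
    similarity reaches the threshold, otherwise it starts a new class."""
    classes = []
    for verb in counts:
        for cls in classes:
            if any(cosine(counts[verb], counts[m]) >= threshold for m in cls):
                cls.append(verb)
                break
        else:
            classes.append([verb])
    return classes


if __name__ == "__main__":
    for cls in group_by_threshold(TOY_COUNTS):
        print(cls)
```

On these toy counts the transitive-leaning verbs and the verbs that favour an a-PP complement end up in separate classes. A real study would replace the hand-written dictionary with frequencies extracted from a parsed corpus, and would likely add semantic features such as the lexical fillers of each argument slot, since syntactic frames alone conflate verbs with quite different meanings.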
