In: Computational Phraseology
Edited by Gloria Corpas Pastor and Jean-Pierre Colson
[IVITRA Research in Linguistics and Literature 24] 2020
pp. 207–224
Verbal collocations and pronominalisation
Published online: 8 May 2020
https://doi.org/10.1075/ivitra.24.11weh
Abstract
Precise identification of multiword expressions (MWEs) is an
important qualitative step for several NLP applications, including machine
translation. Since most MWEs cannot be translated literally, failure to
identify them yields, at best, inaccurate translation. While some
expressions are completely frozen and thus can be listed as compound words,
others display a sometimes very large degree of syntactic flexibility.
In this chapter, we argue not only that structural information is
necessary for an adequate treatment of collocations, but also that the
detection of collocations can be useful for the parser. For instance, it is
very useful for solving part-of-speech ambiguities and also some attachment
ambiguities. We therefore claim that collocation identification and parsing
are interrelated processes.
Section 2 describes the
two processes of parsing and collocation detection and their interaction:
(i) when and how the collocation identification process is triggered during
parsing, and (ii) how the identification of a collocation helps the parser.
In Section 3 we describe how
anaphora resolution has been implemented in our parsing system, to handle
cases where the antecedent and the pronoun are within the same sentence or
in adjacent sentences. Section 4
focuses on more intricate cases of verbal collocations, those in which the
nominal element has been pronominalised, in the form of a relative pronoun or a
personal pronoun. Verb-object collocations with a relative pronoun are
extremely frequent and relatively easy to handle for a “deep” parser. In
most cases, the relative clause is directly attached to the noun which is
part of the collocation. Collocations in which the nominal element takes the
form of a personal pronoun are much harder to deal with, as they depend on
the process of anaphora resolution, a very challenging task. The last
section describes an evaluation of the collocation detection procedure
enhanced with anaphora resolution, using a corpus of newspaper articles
totalling about 10 million words.
Article outline
- 1. Introduction
- 2. Parsing and collocation detection
- 3. Anaphora resolution
- 4. Verbal collocations and pronominalisation
- 5. Experimental results
- 5.1 Evaluation methodology
- 5.2 Evaluation results
- 6. Conclusion
