Article published in: Terminology, Vol. 30:2 (2024), pp. 159–189
Exploring terminological relations between multi-word terms in distributional semantic models
Published online: 27 June 2023
https://doi.org/10.1075/term.21053.wan
Abstract
A term is a lexical unit with a specialized meaning in a particular domain. Terms may be simple terms (STs) or multi-word
terms (MWTs). The organization of terms provides a representation of the structure of domain knowledge, which is based on the
relationships between the concepts of the domain. However, relations between MWTs are often underrepresented in terminology
resources. This work explores distributional semantic models for capturing terminological relations between multi-word terms
through lexical substitution and analogy. The experiments show that the analogy-based method globally outperforms the one based
on lexical substitution, and that analogy is well suited to the acquisition of synonymy, antonymy, and hyponymy, while lexical
substitution performs best for hypernymy.
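The analogy method mentioned in the abstract is commonly realized as the vector-offset scheme of Mikolov et al. (2013): given a seed pair (a, b) linked by a relation and a query term c, the candidate d that maximizes cos(vec(b) − vec(a) + vec(c), vec(d)) is proposed. A minimal sketch with toy multi-word-term vectors follows; the terms and three-dimensional vectors are illustrative assumptions, not data or results from the article.

```python
# Sketch of the vector-offset (analogy) method for relation acquisition.
# Toy 3-d embeddings stand in for real distributional vectors of biterms.
import math

def cos(u, v):
    # cosine similarity between two vectors given as lists of floats
    num = sum(x * y for x, y in zip(u, v))
    den = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return num / den

def analogy(vecs, a, b, c):
    # candidate d maximizing cos(vec(b) - vec(a) + vec(c), vec(d)),
    # excluding the three query terms as is standard practice
    target = [vb - va + vc for va, vb, vc in zip(vecs[a], vecs[b], vecs[c])]
    candidates = {w: v for w, v in vecs.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cos(candidates[w], target))

# Hypothetical toy vectors: "climate warming" is meant as a synonym of
# "global warming", and "greenhouse gases" a variant of "greenhouse gas".
toy = {
    "global warming":   [0.90, 0.10, 0.00],
    "climate warming":  [0.88, 0.12, 0.02],
    "greenhouse gas":   [0.10, 0.90, 0.30],
    "greenhouse gases": [0.12, 0.88, 0.32],
    "sea level":        [0.00, 0.20, 0.90],
}

# seed synonym pair + query term -> proposed synonym of the query term
print(analogy(toy, "global warming", "climate warming", "greenhouse gas"))
# → greenhouse gases
```

On real data the same computation would run over embeddings of multi-word terms learned from a specialized corpus; the lexical-substitution alternative instead asks a masked language model to fill a slot in a knowledge-rich context.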
Article outline
- 1. Introduction
- 2. Identification of semantic relations in DSMs
- 2.1 Semantic relations acquisition using DSMs
- 2.2 Semantic relation acquisition using lexical substitution
- 2.3 Analogy for semantic relation extraction
- 3. Experimental framework
- 3.1 Main resources
- 3.1.1 Corpus
- 3.1.2 Lexical relation databases
- 3.2 Models for the lexical substitution and analogy methods
- 3.3 Distributional semantic models
- 3.4 Evaluation metrics
- 4. Acquisition of synonymy between biterms
- 4.1 Extraction of synonymous biterms from IATE
- 4.2 Acquisition of synonymy between biterms using a masked language model
- Test dataset for the MLM experiments
- Experiment
- Results
- Qualitative analysis
- 4.3 Identification of synonymy between biterms by means of analogy
- Test dataset for analogy
- Experiment
- Results
- Qualitative analysis
- 5. Acquiring other types of lexical relations
- 5.1 Generation of semantically related biterms by semantic projection
- 5.2 Acquiring the other lexical relations by means of masked language models
- Test dataset used in the MLM experiments
- Experimentation
- Results and discussion
- 5.3 Acquiring the other semantic relations by means of analogy
- Test dataset for analogy
- Experiments
- Results and discussion
- 6. Discussion
- 7. Conclusion
- Notes
References
Allen, Carl, and Timothy Hospedales. 2019. “Analogies
Explained: Towards Understanding Word Embeddings.” In International
Conference on Machine Learning, 223–31. Long Beach, California, USA: PMLR.
Arefyev, Nikolay, Boris Sheludko, Alexander Podolskiy, and Alexander Panchenko. 2020. “A
Comparative Study of Lexical Substitution Approaches Based on Neural Language Models.” ArXiv
Preprint ArXiv:2006.00031.
Barrière, Caroline. 2004. “Knowledge-Rich
Contexts Discovery.” In Conference of the Canadian Society for
Computational Studies of Intelligence, Canadian
AI, 187–201. London, Ontario, Canada: Springer.
Bernier-Colborne, Gabriel. 2017. “Aide
à l’identification de Relations Lexicales Au Moyen de La Sémantique Distributionnelle et Son Application à Un Corpus Bilingue
Du Domaine de l’environnement.” PhD Diss., Université de
Montréal.
Bernier-Colborne, Gabriel, and Patrick Drouin. 2016. “Évaluation
des modèles sémantiques distributionnels : le cas de la dérivation syntaxique (Evaluation of distributional semantic models :
The case of syntactic derivation).” In Actes de la conférence
conjointe JEP-TALN-RECITAL 2016. volume 2 : TALN (Articles
longs), 125–38. Paris, France: AFCP – ATALA.
Bojanowski, Piotr, Edouard Grave, Armand Joulin, and Tomas Mikolov. 2017. “Enriching
Word Vectors with Subword Information.” Transactions of the Association for Computational
Linguistics 5: 135–46.
Bouraoui, Zied, Jose Camacho-Collados, and Steven Schockaert. 2020. “Inducing
Relational Knowledge from BERT.” In Proceedings of the AAAI
Conference on Artificial
Intelligence, 34: 7456–63. New York, New York, United States.
Bouraoui, Zied, Shoaib Jameel, and Steven Schockaert. 2018. “Relation
Induction in Word Embeddings Revisited.” In Proceedings of the 27th
International Conference on Computational
Linguistics, 1627–37. Santa Fe, New Mexico, USA: Association for Computational Linguistics.
Bourigault, Didier. 2002. “UPERY :
Un Outil D’analyse Distributionnelle Étendue Pour La Construction D’ontologies à Partir de
Corpus.” In Actes de La 9e Conférence Sur Le Traitement Automatique
Des Langues Naturelles. Articles
Longs, 75–84. Nancy, France: ATALA.
Bullinaria, John A., and Joseph P. Levy. 2012. “Extracting
Semantic Representations from Word Co-Occurrence Statistics: Stop-Lists, Stemming, and
SVD.” Behavior Research
Methods 44 (3): 890–907.
Chaudhri, Vinay K., Justin Xu, Han Lin Aung, and Sajana Weerawardhena. 2022. “A
Corpus of Biology Analogy Questions as a Challenge for Explainable
AI.” In Bridging Human Intelligence and Artificial
Intelligence, edited by Mark V. Albert, Lin Lin, Michael J. Spector, and Lemoyne S. Dunn, 327–37. Educational
Communications and Technology: Issues and
Innovations. Cham: Springer International Publishing.
Chen, Zhiwei, Zhe He, Xiuwen Liu, and Jiang Bian. 2018. “Evaluating
Semantic Relations in Neural Word Embeddings with Biomedical and General Domain Knowledge
Bases.” BMC Medical Informatics and Decision
Making 18 (2): 53–68.
Cram, Damien, and Béatrice Daille. 2016. “Terminology
Extraction with Term Variant Detection.” In Proceedings of ACL-2016
System Demonstrations, 13–18. Berlin, Germany: Association for Computational Linguistics.
Daille, Béatrice. 2017. Term
Variation in Specialised Corpora: Characterisation, Automatic Discovery and
Applications. Vol. 19. Amsterdam / Philadelphia: John Benjamins Publishing Company.
Daille, Béatrice, and Amir Hazem. 2014. “Semi-Compositional
Method for Synonym Extraction of Multi-Word Terms.” In Proceedings of
the Ninth International Conference on Language Resources and Evaluation
(LREC’14), 1202–7. Reykjavik, Iceland: European Language Resources Association (ELRA). [URL]
Depraetere, Ilse. 2019. “Meaning
in Context and Contextual Meaning: A Perspective on the Semantics-Pragmatics Interface Applied to Modal
Verbs.” Anglophonia. French Journal of English
Linguistics, no. 28.
Devlin, Jacob, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. “BERT:
Pre-Training of Deep Bidirectional Transformers for Language
Understanding.” In Proceedings of the 2019 Conference of the North
American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short
Papers), 4171–86. Minneapolis, Minnesota: Association for Computational Linguistics.
Espinosa Anke, Luis, Joan Codina-Filba, and Leo Wanner. 2021. “Evaluating
Language Models for the Retrieval and Categorization of Lexical
Collocations.” In Proceedings of the 16th Conference of the European
Chapter of the Association for Computational Linguistics: Main
Volume, 1406–17. Online: Association for Computational
Linguistics.
Ferret, Olivier. 2021. “Exploration des relations sémantiques sous-jacentes aux plongements contextuels de
mots (Exploring semantic relations underlying contextual word
embeddings).” In Actes de la 28e Conférence sur le Traitement
Automatique des Langues Naturelles. Volume 1 : conférence
principale, 26–36. Lille, France: ATALA.
Fu, Ruiji, Jiang Guo, Bing Qin, Wanxiang Che, Haifeng Wang, and Ting Liu. 2014. “Learning
Semantic Hierarchies via Word Embeddings.” In Proceedings of the 52nd
Annual Meeting of the Association for Computational Linguistics (Volume 1: Long
Papers), 1199–1209. Baltimore, Maryland: Association for Computational Linguistics.
Gábor, Kata, Davide Buscaldi, Anne-Kathrin Schumann, Behrang QasemiZadeh, Haifa Zargayouna, and Thierry Charnois. 2018. “SemEval-2018
Task 7: Semantic Relation Extraction and Classification in Scientific
Papers.” In Proceedings of The 12th International Workshop on
Semantic Evaluation, 679–88. New Orleans, Louisiana.
Gladkova, Anna, Aleksandr Drozd, and Satoshi Matsuoka. 2016. “Analogy-Based
Detection of Morphological and Semantic Relations with Word Embeddings: What Works and What
Doesn’t.” In Proceedings of the NAACL Student Research
Workshop, 8–15. San Diego, California.
Grabar, Natalia, and Thierry Hamon. 2006. “Terminology
Structuring through the Derivational Morphology.” In International
Conference on Natural Language Processing (in
Finland), 652–63. Berlin Heidelberg: Springer.
Hashimoto, Kazuma, Pontus Stenetorp, Makoto Miwa, and Yoshimasa Tsuruoka. 2015. “Task-Oriented
Learning of Word Embeddings for Semantic Relation
Classification.” In Proceedings of the Nineteenth Conference on
Computational Natural Language
Learning, 268–78. Beijing, China: Association for Computational Linguistics.
Hazem, Amir, and Béatrice Daille. 2018. “Word
Embedding Approach for Synonym Extraction of Multi-Word
Terms.” In Proceedings of the Eleventh International Conference on
Language Resources and Evaluation (LREC
2018), 297–303. Miyazaki, Japan: European Language Resources Association (ELRA).
Hmida, Firas, Emmanuel Morin, and Béatrice Daille. 2015. “Extraction
de Contextes Riches En Connaissances En Corpus Spécialisés.” In Actes
de La 22e Conférence Sur Le Traitement Automatique Des Langues Naturelles. Articles
Courts, 109–15. Caen, France: ATALA.
Hou, Jiaqi, Xin Li, Haipeng Yao, Haichun Sun, Tianle Mai, and Rongchen Zhu. 2020. “Bert-Based
Chinese Relation Extraction for Public Security.” IEEE
Access 8: 132367–75.
Jameel, Shoaib, Zied Bouraoui, and Steven Schockaert. 2017. “Modeling
Semantic Relatedness Using Global Relation Vectors.” ArXiv Preprint
ArXiv:1711.05294.
Kilgarriff, Adam, Miloš Husák, Katy McAdam, Michael Rundell, and Pavel Rychlý. 2008. “GDEX:
Automatically Finding Good Dictionary Examples in a
Corpus.” In Proceedings of the XIII EURALEX International
Congress, 425–32. Barcelona, Spain: Documenta Universitaria.
Köper, Maximilian, Christian Scheible, and Sabine Schulte im Walde. 2015. “Multilingual
Reliability and ‘Semantic’ Structure of Continuous Word
Spaces.” In Proceedings of the 11th International Conference on
Computational Semantics, 40–45. London, UK: Association for Computational Linguistics.
Kudo, Taku, and John Richardson. 2018. “SentencePiece:
A Simple and Language Independent Subword Tokenizer and Detokenizer for Neural Text
Processing.” In Proceedings of the 2018 Conference on Empirical
Methods in Natural Language Processing: System
Demonstrations, 66–71. Brussels, Belgium: Association for Computational Linguistics.
Lafourcade, Mathieu, and Lionel Ramadier. 2016. “Semantic
Relation Extraction with Semantic Patterns Experiment on Radiology
Reports.” In Proceedings of the Tenth International Conference on
Language Resources and Evaluation
(LREC’16), 4578–82. Portorož, Slovenia: European Language Resources Association (ELRA).
Lenci, Alessandro. 2008. “Distributional
Semantics in Linguistic and Cognitive Research.” Italian Journal of
Linguistics 20 (1): 1–31.
Lenci, Alessandro, and Giulia Benotto. 2012. “Identifying
Hypernyms in Distributional Semantic Spaces.” In *SEM 2012: The First
Joint Conference on Lexical and Computational Semantics – Volume 1: Proceedings of the Main Conference and the Shared Task,
and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval
2012), 75–79. Montréal, Canada: Association for Computational Linguistics.
Levy, Omer, and Yoav Goldberg. 2014. “Linguistic Regularities in Sparse and Explicit Word Representations.” In Proceedings of the Eighteenth Conference on Computational Natural Language Learning, 171–80. Ann Arbor, Michigan: Association for Computational Linguistics.
Levy, Omer, Yoav Goldberg, and Ido Dagan. 2015. “Improving
Distributional Similarity with Lessons Learned from Word Embeddings.” Transactions of the
Association for Computational
Linguistics 3: 211–25.
Levy, Omer, Steffen Remus, Chris Biemann, and Ido Dagan. 2015. “Do
Supervised Distributional Methods Really Learn Lexical Inference
Relations?” In Proceedings of the 2015 Conference of the North
American Chapter of the Association for Computational Linguistics: Human Language
Technologies, 970–76. Denver, Colorado: Association for Computational Linguistics.
L’Homme, Marie-Claude. 2020. Lexical
Semantics for Terminology: An
Introduction. Vol. 20. Amsterdam / Philadelphia: John Benjamins Publishing Company.
Martin, Louis, Benjamin Muller, Pedro Javier Ortiz Suárez, Yoann Dupont, Laurent Romary, Éric de la Clergerie, Djamé Seddah, and Benoît Sagot. 2020. “CamemBERT:
A Tasty French Language Model.” In Proceedings of the 58th Annual
Meeting of the Association for Computational
Linguistics, 7203–19. Online: Association for
Computational Linguistics.
Meyer, Ingrid. 2001. “Extracting
Knowledge-Rich Contexts for Terminography.” In Recent Advances in
Computational Terminology, edited by Didier Bourigault, Christian Jacquemin and Marie-Claude L’Homme, 279–302. Amsterdam / Philadelphia: John Benjamins.
Mickus, Timothee, Denis Paperno, Mathieu Constant, and Kees van Deemter. 2020. “What
Do You Mean, BERT?” In Proceedings of the Society for Computation in
Linguistics 2020, 279–90. New York, New York: Association for Computational Linguistics.
Mikolov, Tomas, Wen-tau Yih, and Geoffrey Zweig. 2013. “Linguistic
Regularities in Continuous Space Word
Representations.” In Proceedings of the 2013 Conference of the North
American Chapter of the Association for Computational Linguistics: Human Language
Technologies, 746–51. Atlanta, Georgia: Association for Computational Linguistics.
Morin, Emmanuel, and Christian Jacquemin. 1999. “Projecting
Corpus-Based Semantic Links on a Thesaurus.” In Proceedings of the
37th Annual Meeting of the Association for Computational Linguistics on Computational
Linguistics, 389–96. USA: Association for Computational Linguistics.
Morlane-Hondère, François, and Cécile Fabre. 2012. “Le
Test de Substituabilité à l’épreuve Des Corpus: Utiliser l’analyse Distributionnelle Automatique Pour l’étude Des Relations
Lexicales.” In 3e
CMLF, 1: 1001–15. France: EDP Sciences.
Paullada, Amandalynne, Bethany Percha, and Trevor Cohen. 2020. “Improving
Biomedical Analogical Retrieval with Embedding of Structural
Dependencies.” In Proceedings of the 19th SIGBioMed Workshop on
Biomedical Language Processing, 38–48. Online:
Association for Computational Linguistics.
Peters, Matthew E., Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettlemoyer. 2018. “Deep
Contextualized Word Representations.” In Proceedings of the 2018
Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume
1 (Long Papers), 2227–37. New Orleans, Louisiana: Association for Computational Linguistics.
Peters, Matthew E., Mark Neumann, Luke Zettlemoyer, and Wen-tau Yih. 2018. “Dissecting
Contextual Word Embeddings: Architecture and
Representation.” In Proceedings of the 2018 Conference on Empirical
Methods in Natural Language
Processing, 1499–1509. Brussels, Belgium: Association for Computational Linguistics.
Polguère, Alain. 2016. Lexicologie
et Sémantique Lexicale: Notions
Fondamentales. Montréal: Presses de l’Université de Montréal.
Qiang, Jipeng, Yun Li, Yi Zhu, Yunhao Yuan, and Xindong Wu. 2019. “A
Simple BERT-Based Approach for Lexical
Simplification.” ArXiv abs/1907.06226.
Qiao, Bo, Zhuoyang Zou, Yu Huang, Kui Fang, Xinghui Zhu, and Yiming Chen. 2022. “A
Joint Model for Entity and Relation Extraction Based on BERT.” Neural Comput.
Appl. 34 (5): 3471–81.
Roller, Stephen, Katrin Erk, and Gemma Boleda. 2014. “Inclusive
yet Selective: Supervised Distributional Hypernymy
Detection.” In Proceedings of COLING 2014, the 25th International
Conference on Computational Linguistics: Technical
Papers, 1025–36. Dublin, Ireland: Dublin City University and Association for Computational Linguistics.
Santos, Cicero Nogueira dos, Bing Xiang, and Bowen Zhou. 2015. “Classifying
Relations by Ranking with Convolutional Neural
Networks.” In Proceedings of the 53rd Annual Meeting of the
Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1:
Long Papers), 626–34. Beijing, China: Association for Computational Linguistics.
Schick, Timo, and Hinrich Schütze. 2019. “Attentive
Mimicking: Better Word Embeddings by Attending to Informative
Contexts.” In Proceedings of the 2019 Conference of the North
American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short
Papers), 489–94. Minneapolis, Minnesota: Association for Computational Linguistics.
Schick, Timo, and Hinrich Schütze. 2020. “Rare
Words: A Major Problem for Contextualized Embeddings and How to Fix It by Attentive
Mimicking.” In Proceedings of the AAAI Conference on Artificial
Intelligence, 34: 8766–74. New York, USA: AAAI Press.
Shi, Peng, and Jimmy Lin. 2019. “Simple
BERT Models for Relation Extraction and Semantic Role Labeling.” ArXiv Preprint
ArXiv:1904.05255.
Sutskever, Ilya, Oriol Vinyals, and Quoc V. Le. 2014. “Sequence
to Sequence Learning with Neural Networks.” In Proceedings of the
27th International Conference on Neural Information Processing Systems – Volume
2, 3104–12. Montreal, Canada: MIT Press.
Turney, Peter D. 2005. “Measuring Semantic Similarity
by Latent Relational Analysis.” In Proceedings of the 19th
International Joint Conference on Artificial
Intelligence, 1136–41. IJCAI’05. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc.
Verspoor, Cornelia M., Cliff Joslyn, and George J. Papcun. 2003. “The
Gene Ontology as a Source of Lexical Semantic Knowledge for a Biological Natural Language Processing
Application.” In SIGIR Workshop on Text Analysis and Search for
Bioinformatics, 51–56. Toronto, Canada.
Vylomova, Ekaterina, Laura Rimell, Trevor Cohn, and Timothy Baldwin. 2016. “Take
and Took, Gaggle and Goose, Book and Read: Evaluating the Utility of Vector Differences for Lexical Relation
Learning.” In Proceedings of the 54th Annual Meeting of the
Association for Computational Linguistics (Volume 1: Long
Papers), 1671–82. Berlin, Germany: Association for Computational Linguistics.
Weeds, Julie, Daoud Clarke, Jeremy Reffin, David Weir, and Bill Keller. 2014. “Learning
to Distinguish Hypernyms and Co-Hyponyms.” In Proceedings of COLING
2014, the 25th International Conference on Computational Linguistics: Technical
Papers, 2249–59. Dublin, Ireland: Dublin City University and Association for Computational Linguistics.
Weeds, Julie, and David Weir. 2003. “A
General Framework for Distributional Similarity.” In Proceedings of
the 2003 Conference on Empirical Methods in Natural Language Processing, 81–88. EMNLP
’03. USA: Association for Computational Linguistics.
Xue, Kui, Yangming Zhou, Zhiyuan Ma, Tong Ruan, Huanhuan Zhang, and Ping He. 2019. “Fine-Tuning
BERT for Joint Entity and Relation Extraction in Chinese Medical
Text.” In 2019 IEEE International Conference on Bioinformatics and
Biomedicine (BIBM), 892–97. San Diego, CA, USA: IEEE.
Yao, Liang, Chengsheng Mao, and Yuan Luo. 2019. “KG-BERT:
BERT for Knowledge Graph Completion.” ArXiv Preprint
ArXiv:1909.03193.
Zhang, Li, Jun Li, and Chao Wang. 2017. “Automatic
Synonym Extraction Using Word2Vec and Spectral Clustering.” In 2017
36th Chinese Control Conference
(CCC), 5629–32. Dalian, China: IEEE.
Zhou, Wangchunshu, Tao Ge, Ke Xu, Furu Wei, and Ming Zhou. 2019. “BERT-Based
Lexical Substitution.” In Proceedings of the 57th Annual Meeting of
the Association for Computational
Linguistics, 3368–73. Florence, Italy: Association for Computational Linguistics.
Cited by (2)
L’Homme, Marie-Claude. 2025. “Representing multiword expressions in terminology resources.” Terminology. International Journal of Theoretical and Applied Issues in Specialized Communication.
This list is based on CrossRef data as of 6 December 2025. Please note that it may not be complete. Sources presented here have been supplied by the respective publishers. Any errors therein should be reported to them.
