Article published in: Terminology: Online-First Articles
Discovering hyponymic knowledge patterns in English
Available under the Creative Commons Attribution-NonCommercial (CC BY-NC) 4.0 license.
For any use beyond this license, please contact the publisher at rights@benjamins.nl.
This article was made Open Access under a CC BY-NC 4.0 license through payment of an APC by or on behalf of the authors.
Published online: 14 October 2025
https://doi.org/10.1075/term.25022.san
Abstract
Identifying hyponymy is essential in terminology work. This article addresses the lack of a comprehensive
inventory of hyponymic knowledge patterns (KPs) in English by presenting a robust methodology for their collection. Drawing on six
complementary strategies — literature review, machine translation, parallel corpora, human translation, bootstrapping, and
generative artificial intelligence — the study identified and validated 110 distinct English hyponymic patterns, many of which had
not been previously documented. These patterns will serve to update the English version of the EcoLexicon Semantic Sketch Grammar
(ESSG-en), a KP-based tool for extracting semantic relations from corpora in Sketch Engine. The findings highlight the strengths
and limitations of each strategy and underscore the value of combining methods to achieve broader coverage. Ultimately, this research
fills a significant gap by delivering the most extensive list of English hyponymic patterns to date.
Keywords: knowledge patterns, hyponymy, corpus analysis
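To make the notion of a hyponymic knowledge pattern concrete, the sketch below matches one well-known pattern from the literature, "X such as Y, Z and W" (Hearst 1992), with a plain regular expression. It is only an illustration under simplifying assumptions: terms are single words, no lemmas or part-of-speech tags are used, and the names `SUCH_AS` and `extract_hyponyms` are hypothetical. It is not the ESSG-en grammar, which is written for Sketch Engine and is not reproduced here.

```python
import re

# Hypothetical, simplified pattern: "<hypernym> such as <hyponym>, <hyponym> and <hyponym>".
# Assumption: every term is a single word; real KPs rely on lemmas and POS tags.
SUCH_AS = re.compile(
    r"(?P<hyper>\w+),? such as "
    r"(?P<hypos>\w+(?:, (?!and\b|or\b)\w+)*(?:,? (?:and|or) \w+)?)"
)


def extract_hyponyms(sentence):
    """Return (hyponym, hypernym) pairs found by the 'such as' pattern."""
    pairs = []
    for match in SUCH_AS.finditer(sentence):
        hypernym = match.group("hyper")
        # Split the enumeration on commas and on the final "and"/"or".
        hyponyms = re.split(r",? (?:and|or) |, ", match.group("hypos"))
        pairs.extend((hypo, hypernym) for hypo in hyponyms if hypo)
    return pairs


if __name__ == "__main__":
    example = "The study measured pollutants such as ozone, mercury and lead."
    for hyponym, hypernym in extract_hyponyms(example):
        print(f"{hyponym} is a type of {hypernym}")
```

A single regular expression like this one also exhibits several of the difficulties listed in the article outline, such as noise and too many intervening words between the elements, which is one reason a broad, validated inventory of patterns is useful.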
Article outline
- 1. Introduction
- 2. Hyponymy
- 3. Knowledge patterns
  - 3.1 From knowledge patterns to knowledge-rich contexts
    - Noise
    - Polysemy
    - Anaphoric reference
    - Non-noun elements in SRs
    - Too many intervening words between the elements
    - Varying degrees of certainty
    - Unpredictable KPs
    - Domain dependency
    - Genre dependency
    - KP-based approaches can be time-consuming
    - Language variability
  - 3.2 Relation extraction by means of word sketches: The ESSG project
- 4. Methodology
  - 4.1 The ESSG methodology
  - 4.2 Collection methods
    - 4.2.1 Patterns collection from literature review
    - 4.2.2 Patterns collection with machine translation
    - 4.2.3 Patterns collection with a parallel corpus
    - 4.2.4 Patterns collection with human translation
    - 4.2.5 Patterns collection by bootstrapping
    - 4.2.6 Patterns collection with generative AI
  - 4.3 Patterns validation and consolidation
- 5. Results
  - 5.1 Knowledge patterns collection
    - 5.1.1 Patterns obtained from literature review
    - 5.1.2 Patterns obtained with machine translation
    - 5.1.3 Patterns obtained with a parallel corpus
    - 5.1.4 Patterns obtained with human translation
    - 5.1.5 Patterns obtained by bootstrapping
    - 5.1.6 Patterns obtained with generative AI
  - 5.2 Pattern validation and consolidation
  - 5.3 Final list of patterns
  - 5.4 Discussion
- 6. Conclusions
- Acknowledgments
- Notes
- References