Linguistic correlates of semantic knowledge ontologies: An example of UDC and DDC classification

Pawłowski, Adam; Walkowiak, Tomasz; Johnsen, Lars G.

doi:10.1075/cilt.370.14paw

In: Mathematical Modelling in Linguistics and Text Analysis: Theory and applications
Edited by Adam Pawłowski, Sheila Embleton, Jan Mačutek and Aris Xanthos
[Current Issues in Linguistic Theory 370] 2025
pp. 161–172

Adam Pawłowski | Wrocław University

Tomasz Walkowiak | Wrocław University of Technology

Lars G. Johnsen | National Library of Norway

Published online: 13 October 2025

Abstract

This chapter presents a comparative analysis of bibliographic corpora that include titles extracted from large national bibliographies (Czech, Finnish, German, Norwegian, and Polish). Subsets corresponding to the basic categories of the Dewey Decimal Classification (DDC) and Universal Decimal Classification (UDC) formal ontologies were extracted from these corpora. The most relevant keywords were then generated from each subset and projected onto a common semantic space using automatic translation. The study revealed common ‘European’ sets of terms that correspond to the large semantic domains defined in the DDC/UDC ontology.

Keywords: large bibliographies, Decimal Classification, semantic domains, cross-linguistic similarities, conceptual grids

Article outline

1. Introduction
2. Research material
3. Goals and hypotheses
4. Previous research
5. Methods applied
6. Results
7. Conclusions and discussion
Notes
References
