In: Mathematical Modelling in Linguistics and Text Analysis: Theory and applications
Edited by Adam Pawłowski, Sheila Embleton, Jan Mačutek and Aris Xanthos
[Current Issues in Linguistic Theory 370] 2025
pp. 43–59
Simple stochastic processes behind Menzerath’s Law
Published online: 13 October 2025
https://doi.org/10.1075/cilt.370.04mil
Abstract
This paper revisits Menzerath’s Law, also known as the Menzerath-Altmann Law, which models a relationship
between the length of a linguistic construct and the average length of its constituents. Recent findings indicate that simple
stochastic processes can display Menzerathian behaviour, though existing models fail to accurately reflect real-world
data.
If we adopt the basic principle that a word can change its length in both syllables and phonemes, that the
correlation between these two variables is imperfect, and that the changes are multiplicative in nature, we obtain a bivariate
log-normal distribution. The present paper shows that the classic Altmann model of the Menzerath-Altmann Law follows from
this very simple principle.
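The link between a bivariate log-normal distribution and a power law can be checked numerically. The following is a minimal sketch, not the chapter's own code: it assumes that word length in syllables (x) and in phonemes (y) are continuous and jointly log-normal, and all parameter values and the function name `fit_menzerath_exponent` are made up for illustration. Under that assumption, regressing ln(y/x) on ln x gives the slope b = ρ·σ_y/σ_x − 1, so the mean syllable length behaves like a·x^b, a power-law form of the Menzerath-Altmann formula.

```python
import numpy as np


def fit_menzerath_exponent(mu, sigma_x, sigma_y, rho, n=200_000, seed=0):
    """Simulate jointly log-normal lengths and fit ln(y/x) = ln(a) + b*ln(x).

    mu      -- (mean of ln x, mean of ln y); illustrative values only
    sigma_* -- standard deviations of ln x and ln y
    rho     -- correlation between ln x and ln y
    Returns the fitted (b, ln a).
    """
    rng = np.random.default_rng(seed)
    cov = [
        [sigma_x**2, rho * sigma_x * sigma_y],
        [rho * sigma_x * sigma_y, sigma_y**2],
    ]
    # Draw (ln x, ln y) from a bivariate normal, then exponentiate.
    logs = rng.multivariate_normal(mu, cov, size=n)
    x = np.exp(logs[:, 0])  # construct length (e.g. syllables per word)
    y = np.exp(logs[:, 1])  # total length in constituents' units (phonemes)

    # Mean constituent length is y / x; fit a straight line in log-log space.
    slope, intercept = np.polyfit(np.log(x), np.log(y / x), 1)
    return slope, intercept


# Hypothetical parameters: theory predicts b = 0.8 * 0.4 / 0.5 - 1 = -0.36.
b, ln_a = fit_menzerath_exponent(mu=(1.0, 2.0), sigma_x=0.5, sigma_y=0.4, rho=0.8)
print(f"fitted exponent b = {b:.3f}")
```

In this sketch the exponent is negative, that is, longer words have shorter syllables on average, whenever ρ·σ_y/σ_x is below 1.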
If we model the dependence between the two variables separately from their marginal distributions, we can
obtain an even more accurate model by using a Gaussian copula. The models are tested against empirical data, and alternative
approaches are discussed.
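The idea of a Gaussian copula can be sketched in a few lines. This is only an illustration under stated assumptions, not the procedure used in the chapter: the function name `gaussian_copula_sample` and both probability tables are hypothetical. Correlated standard normal pairs are mapped to uniform values through the normal CDF, and each uniform value is then pushed through the inverse CDF of a freely chosen discrete marginal distribution.

```python
import math

import numpy as np


def gaussian_copula_sample(rho, pmf_x, pmf_y, n=100_000, seed=0):
    """Sample integer length pairs (x, y) whose dependence is a Gaussian copula.

    rho          -- correlation of the underlying bivariate normal
    pmf_x, pmf_y -- probabilities of lengths 1, 2, 3, ... (illustrative only)
    """
    rng = np.random.default_rng(seed)
    z = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=n)

    # Standard normal CDF turns each coordinate into a Uniform(0, 1) value.
    u = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))

    def inverse_cdf(pmf, uniform):
        cdf = np.cumsum(np.asarray(pmf, dtype=float) / np.sum(pmf))
        cdf[-1] = 1.0  # guard against rounding error in the last step
        # Smallest length k with cdf(k) >= u; lengths start at 1.
        return np.searchsorted(cdf, uniform) + 1

    return inverse_cdf(pmf_x, u[:, 0]), inverse_cdf(pmf_y, u[:, 1])


# Hypothetical marginals: syllables per word (1-4), phonemes per word (1-8).
syllable_pmf = [0.35, 0.40, 0.20, 0.05]
phoneme_pmf = [0.05, 0.15, 0.20, 0.20, 0.15, 0.10, 0.10, 0.05]
x, y = gaussian_copula_sample(0.8, syllable_pmf, phoneme_pmf)

# Average syllable length (phonemes per syllable) for each word length.
for k in range(1, len(syllable_pmf) + 1):
    print(k, round(float(np.mean(y[x == k] / k)), 3))
```

The marginals are reproduced exactly by construction, while `rho` alone controls how strongly the two lengths move together. Nothing in this sketch enforces that a word has at least as many phonemes as syllables; a realistic model would need to handle that constraint.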
Article outline
- 1. Introduction
- 2. Explanation of Altmann’s classical model by bivariate log-normal distribution
- 3. Using a Gaussian copula
- 4. Using segment boundaries instead of segments
- 5. Conclusions
- Data availability
- Acknowledgements
- References
References (17)
Altmann, Gabriel. 1980. Prolegomena to Menzerath’s law. In Rüdiger Grotjahn (ed.), Glottometrika 2, 1–10. Bochum: Brockmeyer.
Benešová, Martina & Radek Čech. 2015. Menzerath-Altmann law versus random model. In George K. Mikros & Ján Mačutek (eds.), Sequences in language and text, 57–70. Berlin: de Gruyter.
Galton, Francis. 1886. Regression towards mediocrity in hereditary stature. The Journal of the Anthropological Institute of Great Britain and Ireland 15. 246–263.
Grzybek, Peter. 2007. History and methodology of word length studies: The state of the art. In Peter Grzybek (ed.), Contributions to the science of text and language, 15–90. Dordrecht: Springer.
Inouye, David I., Eunho Yang, Genevera I. Allen & Pradeep Ravikumar. 2017. A review of multivariate distributions for count data derived from the Poisson distribution. Wiley Interdisciplinary Reviews: Computational Statistics 9(3). e1398.
Köhler, Reinhard. 1993. Synergetic linguistics. In Reinhard Köhler & Burghard B. Rieger (eds.), Contributions to quantitative linguistics: Proceedings of the first international conference on quantitative linguistics, QUALICO, Trier, 1991, 41–51. Dordrecht: Springer.
Kułacka, Agnieszka. 2010. The coefficients in the formula for the Menzerath-Altmann law. Journal of Quantitative Linguistics 17(4). 257–268.
Lakshminarayana, J., S. N. N. Pandit & K. Srinivasa Rao. 1999. On a bivariate Poisson distribution. Communications in Statistics - Theory and Methods 28(2). 267–276.
Mikros, Georgios & Jiří Milička. 2014. Distribution of the Menzerath’s law on the syllable level in Greek texts. In Gabriel Altmann, Radek Čech, Ján Mačutek & Ludmila Uhlířová (eds.), Empirical approaches to text and language analysis, 180–189. Lüdenscheid: RAM-Verlag.
Milička, Jiří. 2014. Menzerath’s law: the whole is greater than the sum of its parts. Journal of Quantitative Linguistics 21(2). 85–99.
Milička, Jiří. 2015. Teorie komunikace jakožto explanatorní princip přirozené víceúrovňové segmentace textů [The theory of communication as an explanatory principle for the natural multilevel text segmentation]. PhD Thesis, Faculty of Arts, Charles University, Prague. Available from: [URL]
Motalová, Tereza. 2022. Menzerath-Altmann law in Chinese. PhD Thesis, Faculty of Arts, Palacký University, Olomouc. Available from: [URL]
Torre, Iván G., Bartolo Luque, Lucas Lacasa, Christopher T. Kello & Antoni Hernández-Fernández. 2019. On the physical origin of linguistic laws and lognormality in speech. Royal Society Open Science 6(8). 191023.
