Article published in: International Journal of Corpus Linguistics
Vol. 27:2 (2022), pp. 220–247
Degrees of non-standardness
Feature-based analysis of variation in a Torlak dialect corpus
Published online: 20 May 2022
https://doi.org/10.1075/ijcl.20014.vuk
Abstract
This paper presents a corpus-based method for assessing the range of dialect-standard variation and for identifying the samples
with the highest prevalence of dialect features. The method provides insight into areal and inter-speaker variation and
allows the extraction of maximally non-standard manifestations of the dialect, which can then be sampled and used to study
language change and variation. The focus is on a non-standard Torlak variety that has undergone considerable change under the
influence of standard Serbian. The degree of variation is assessed by measuring the frequencies of five distinguishing linguistic
features: accent position, the dative reflexive si, auxiliary omission in the compound perfect, the post-positive
article, and analytic case marking of the indirect object and possessive. Hierarchical clustering reveals the locations subject
to the greatest and the least influence of the standard. Positive correlations between the features' frequencies of occurrence
reveal which non-standard feature is the best predictor of the others.
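
The two analytic steps named in the abstract, clustering locations by feature frequency and correlating the features with one another, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the linkage method, the distance measure or the correlation coefficient, so average linkage, Euclidean distance and Pearson's r are assumptions here, as is the reading of "best predictor" as the feature with the highest mean correlation with the other four. The location names and all rates are invented for illustration.

```python
from itertools import combinations
from math import sqrt

# Hypothetical per-location rates (share of non-standard realisations, 0..1).
# These numbers are invented for illustration; they are NOT corpus results.
FEATURES = ["accent", "si", "aux_omission", "article", "analytic_case"]
RATES = {
    "loc_A": [0.90, 0.80, 0.70, 0.85, 0.75],
    "loc_B": [0.85, 0.75, 0.65, 0.80, 0.70],
    "loc_C": [0.20, 0.30, 0.25, 0.10, 0.15],
    "loc_D": [0.25, 0.35, 0.20, 0.15, 0.10],
}


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (non-constant input)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def euclidean(a, b):
    """Euclidean distance between two feature-rate vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(rates, k):
    """Agglomerative clustering: merge the two closest clusters until k remain."""
    clusters = [[name] for name in rates]

    def dist(c1, c2):
        # Average linkage: mean distance over all cross-cluster pairs.
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(euclidean(rates[a], rates[b]) for a, b in pairs) / len(pairs)

    while len(clusters) > k:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]  # i < j, so deleting j is safe
        del clusters[j]
    return [sorted(c) for c in clusters]


def best_predictor(rates, features):
    """Return the feature with the highest mean correlation with the others."""
    columns = list(zip(*rates.values()))  # one tuple of rates per feature
    mean_r = {}
    for idx, name in enumerate(features):
        others = [
            pearson(columns[idx], columns[j])
            for j in range(len(features))
            if j != idx
        ]
        mean_r[name] = sum(others) / len(others)
    return max(mean_r, key=mean_r.get), mean_r


clusters = average_linkage(RATES, k=2)
top_feature, mean_correlations = best_predictor(RATES, FEATURES)
print(clusters)
print(top_feature, round(mean_correlations[top_feature], 3))
```

With real data, each row would hold the observed relative frequencies for one location or speaker, and the cluster containing the highest rates would correspond to the samples least influenced by the standard.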
Article outline
- 1. Introduction
- 2. Variation in Torlak
- 2.1 Dimensions of variation
- 2.2 Assessing variation
- 3. Torlak features chosen for analysis
- 3.1 Selection
- 3.2 Accent position
- 3.3 The clitic si
- 3.4 Omission of 3rd person auxiliary with l-perfect
- 3.5 Post-positive article
- 3.6 Analytic dative marking of the possessive and indirect object
- 3.7 Operationalization
- 4. The Timok sample
- 5. Measuring variation
- 5.1 Analysis
- 5.2 Results
- 6. Discussion
- 7. Conclusion
- Acknowledgements
- Notes
- References
