The most frequently used statistics in corpus linguistics are frequencies of occurrence and frequencies of co-occurrence of two or more linguistic variables. However, such frequencies in isolation may sometimes be misleading since they do not take into consideration the degree of dispersion of the relevant linguistic variable. Many dispersion measures and adjusted frequency measures have been suggested, but they are neither widely known nor widely applied. Another unfortunate aspect of such measures is that many also come with a variety of problems. I pursue three objectives with this article. First, I want to raise awareness of this issue and make the available measures more widely known, so I present an overview of many measures of dispersion and adjusted frequencies. Second, I propose a conceptually simple alternative measure, DP, explain and exemplify it, and compare it to previously discussed measures. Third and most importantly, I urge corpus linguists to explore the notion of dispersion in more detail and outline a few proposals for which steps to take next.
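As a minimal illustration of why raw frequencies can mislead, the sketch below computes DP under the definition commonly given for it. This is an assumption, since the abstract itself does not state the formula: DP is taken to be half the sum, over all corpus parts, of the absolute difference between a part's share of the word's occurrences and that part's share of the corpus. The function name `dp` and the toy counts are hypothetical and serve only as an example.

```python
def dp(counts, part_sizes):
    """Deviation of proportions (DP), as commonly defined (assumed here).

    counts     -- occurrences of the word in each corpus part
    part_sizes -- size (e.g. in tokens) of each corpus part
    Returns a value between 0 (perfectly even) and close to 1 (very clumped).
    """
    if len(counts) != len(part_sizes):
        raise ValueError("counts and part_sizes must have the same length")
    total_count = sum(counts)
    total_size = sum(part_sizes)
    if total_count == 0 or total_size == 0:
        raise ValueError("the word and the corpus must both be non-empty")
    # Compare the observed share of occurrences in each part with the
    # share expected from the part's size, then halve the summed gaps.
    return 0.5 * sum(
        abs(c / total_count - s / total_size)
        for c, s in zip(counts, part_sizes)
    )


# Toy data (hypothetical): four corpus parts of equal size.
sizes = [1000, 1000, 1000, 1000]
evenly_spread = [5, 5, 5, 5]   # 20 occurrences, spread across all parts
clumped = [20, 0, 0, 0]        # 20 occurrences, all in one part

print(dp(evenly_spread, sizes))  # 0.0
print(dp(clumped, sizes))        # 0.75
```

Both toy words occur 20 times, so their raw frequencies are identical, yet their DP values differ sharply (0.0 versus 0.75). This is the kind of difference that frequencies in isolation cannot show.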