This paper explores keywords, key part-of-speech categories and key semantic categories and their role in text analysis. The first part of the paper addresses a set of issues relating to the definition of keywords and their history, the settings used in deriving keywords, the choice of reference corpora, the different kinds of keyword that emerge in one’s results and the dispersion of keywords in one’s data. It argues, amongst other things, that keywords are the same as style markers, and that three types of keyword can be identified: interpersonal, textual and ideational. The second part of the paper addresses the question of what precisely is to be gained from analysing key part-of-speech categories or key semantic domains in addition to keywords. It shows that whilst in general they add little to a keyword analysis, which is in any case methodologically more robust, there are some significant specific benefits. Answers to many of the questions posed in this paper are illustrated by a study of character-talk from Shakespeare’s play Romeo and Juliet, and in this way this paper also makes a contribution to the fledgling field of corpus stylistics.
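To make the notion of deriving keywords against a reference corpus concrete, the following is a minimal sketch of one common approach: scoring each word with the log-likelihood statistic and keeping words that are relatively more frequent in the target text. The abstract does not state which statistic or settings the paper uses, so the choice of log-likelihood, the function names, the minimum frequency and the 3.84 cut-off (the chi-square critical value for p < 0.05 with one degree of freedom) are all assumptions made for illustration.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Log-likelihood keyness for a word occurring a times in a target
    corpus of c tokens and b times in a reference corpus of d tokens."""
    e1 = c * (a + b) / (c + d)  # expected frequency in the target corpus
    e2 = d * (a + b) / (c + d)  # expected frequency in the reference corpus
    ll = 0.0
    if a > 0:  # treat 0 * ln(0) as 0
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll


def keywords(target_tokens, reference_tokens, min_freq=2, threshold=3.84):
    """Return (word, score) pairs for positive keywords, highest score first.

    min_freq and threshold are illustrative settings, not the paper's.
    """
    tgt, ref = Counter(target_tokens), Counter(reference_tokens)
    c, d = len(target_tokens), len(reference_tokens)
    results = []
    for word, a in tgt.items():
        b = ref[word]
        if a < min_freq:
            continue
        # Positive keywords only: relative frequency must be higher in the target.
        if a / c <= b / d:
            continue
        score = log_likelihood(a, b, c, d)
        if score >= threshold:
            results.append((word, score))
    return sorted(results, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Invented toy token lists, not data from the study.
    target = ["love"] * 5 + ["the"] * 5
    reference = ["the"] * 10
    for word, score in keywords(target, reference):
        print(word, round(score, 2))
```

In this sketch the result depends directly on the reference corpus and on the settings passed in, which is why the choice of reference corpus and of thresholds matters for which keywords emerge.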