Assessing the potential of using large language models for pragmatic annotation of historical texts: A case of epistemic stance in Early Modern English

Huang, Ding; Xu, Jiajin; Song, Yingming; Yu, Ruchen

doi:10.1075/jhp.25011.hua

Article published in: Journal of Historical Pragmatics
Vol. 26:3 (2025), pp. 406–437

Ding Huang | Beijing Foreign Studies University

Jiajin Xu | Beijing Foreign Studies University

Yingming Song | Beijing Foreign Studies University

Ruchen Yu | Beijing Foreign Studies University

Published online: 4 December 2025

https://doi.org/10.1075/jhp.25011.hua

Abstract

This study investigates the viability of using large language models (LLMs) for the pragmatic annotation of historical texts. Using a small corpus of witness depositions, it compares Claude 3.5 Sonnet, an LLM that excels in reasoning over text, with two human annotators on the pragmatic annotation of Early Modern English (EModE) texts. The study also compares the model’s annotations of modernised and original-spelling versions of the corpus to explore whether EModE spelling variation affects its performance. The results show that, although the model’s annotations were less satisfactory than the human annotators’, it achieved moderate inter-coder agreement and balanced precision and recall, a balance that is desirable in this task because it maximises identification without sacrificing accuracy. Furthermore, the prevalent spelling variation did not significantly impair the model’s ability to recognise epistemic stance in the original EModE texts. We therefore propose a human–AI collaboration approach to historical pragmatic annotation.

Keywords: Early Modern English, epistemic stance, historical pragmatics, large language models, pragmatic annotation
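
The agreement and accuracy measures named in the abstract can be illustrated with a minimal sketch. It assumes Cohen's kappa as the inter-coder agreement statistic and a binary token-level labelling ("stance" versus "none"); the abstract specifies neither. The toy labels, label names and function names below are hypothetical and are not data or code from the study.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(labels_a)
    # Observed agreement: share of items on which both annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's marginal proportions.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    categories = set(freq_a) | set(freq_b)
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)
    if expected == 1.0:  # both annotators used a single identical label
        return 1.0
    return (observed - expected) / (1.0 - expected)


def precision_recall_f1(gold, predicted, positive="stance"):
    """Precision, recall and F1 for one positive label against a gold standard."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical toy annotations (not taken from the study's corpus).
human = ["stance", "none", "stance", "none", "none", "stance", "none", "none"]
model = ["stance", "none", "none", "none", "stance", "stance", "none", "none"]

kappa = cohen_kappa(human, model)
precision, recall, f1 = precision_recall_f1(human, model)
print(f"kappa={kappa:.3f} precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

In this toy case the model makes one false positive and one false negative, so precision and recall are equal. That is the kind of balance the abstract describes: neither measure is traded off against the other.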

Article outline

1. Introduction
2. Background and literature review
- 2.1 Epistemic stance in Early Modern English
- 2.2 Challenges in pragmatic analysis of historical text with a corpus linguistic approach
- 2.3 Application of large language models to historical texts
- 2.4 Pragmatic annotation with large language models
- 2.5 Common approaches in prompt engineering
3. Methods and procedures
- 3.1 Source of data
- 3.2 LLM annotation procedure and prompt design
- 3.3 LLM annotation protocol
4. Results
- 4.1 Large language model versus human annotators
- 4.2 Performance of Claude 3.5 Sonnet in annotating Early Modern English texts in original spelling
- 4.3 Error analysis
  - 4.3.1 Annotating the epistemic use of modals
  - 4.3.2 Annotating the epistemic use of the emphatic
  - 4.3.3 Annotating certainty/likelihood verbs and communication verbs
  - 4.3.4 Annotating certainty and likelihood adjectives and adverbs
5. Discussion
6. Conclusion
Notes
References
