Chapter 17. Generative artificial intelligence tools

Xu, Yiran; Polio, Charlene

doi:10.1075/rmal.15.17xu

In: Digital and Internet-Based Research Methods in Applied Linguistics
Edited by Matt Kessler
[Research Methods in Applied Linguistics 15] 2026
pp. 362–385

Yiran Xu | University of California, Merced

Charlene Polio | Michigan State University

Published online: 5 January 2026

https://doi.org/10.1075/rmal.15.17xu

Abstract

This chapter discusses how generative artificial intelligence (GenAI) tools, particularly large language models (LLMs) such as ChatGPT, are emerging as powerful web-based tools for research tasks such as data analysis in applied linguistics. While much attention has focused on pedagogical applications, we review how GenAI can support various stages of the research process in empirical studies, including instrument design, automated coding, text annotation, and qualitative data analysis. We address key concerns about validity and reliability, as well as ethical considerations related to transparency, data privacy, and potential bias in AI-generated output. Because GenAI is at an early stage of research application, we describe its current capacities and limitations based on emerging empirical research and propose promising directions for future studies.

Article outline

1. Introduction
2. Frequently asked research questions
- Use of GenAI in the research workflow
- Validity and reliability of GenAI for quantitative studies
- Ethical concerns
3. Implementation
- Designing assessment materials and research instruments
- Assessing speaking and writing
- Analyzing learner data
- Transcription and thematic data analysis
- Corpus-based analyses and text annotation
- Research article writing and revision
- Validity and reliability of GenAI
4. Example studies
- Mizumoto and Eguchi (2023)
- Pfau et al. (2023) and Xu et al. (2024)
- Casal and Kessler (2023)
- Kim and Lu (2024)
- Morgan (2023)
- Curry et al. (2024)
5. Ethics and research integrity considerations
- Privacy and confidentiality
- Bias and transparency
- Attribution of work
6. Challenges and issues
- Open science and transparency
- Replication
- Quality of the output
- Access
7. Future research directions
References

References

Belal, M., She, J., & Wong, S. (2023). Leveraging ChatGPT as text annotation tool for sentiment analysis. arXiv:2306.17177.

Bender, E. M., Gebru, T., McMillan-Major, A., & Shmitchell, S. (2021, March). On the dangers of stochastic parrots: Can language models be too big? In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (pp. 610–623). ACM.

Brand, J., Israeli, A., & Ngwe, D. (2024). Using LLMs for market research. Harvard Business School Working Paper, 23–062.

Byrd, A., Flores, L., Green, D., Hassel, H., Johnson, S., Kirschenbaum, M., … & Mills, A. (2023). MLA-CCCC joint task force on writing and AI working paper: Overview of the issues, statement of principles, and recommendations. MLA-CCCC Joint Task Force. Retrieved on 24 September 2025 from [URL]

Cabral, S., Restrepo, D., Kanjee, Z., Wilson, P., Crowe, B., Abdulnour, R. E., & Rodman, A. (2024). Clinical reasoning of a generative artificial intelligence model compared with physicians. JAMA Internal Medicine, 184(5), 581–583.

Casal, J. E., & Kessler, M. (2023). Can linguists distinguish between ChatGPT/AI and human writing?: A study of research ethics and academic publishing. Research Methods in Applied Linguistics, 2(3), 100068.

Chun, J. Y., & Barley, N. (2024). A comparative analysis of multiple-choice questions: ChatGPT-generated items vs. human-developed items. In C. A. Chapelle, G. H. Beckett, & J. Ranalli (Eds.), Exploring artificial intelligence in applied linguistics (pp. 118–136). Iowa State University Digital Press.

Crosthwaite, P., & Baisa, V. (2023). Generative AI and the end of corpus-assisted data-driven learning? Not so fast! Applied Corpus Linguistics, 3(3), 100066.

Curry, N., Baker, P., & Brookes, G. (2024). Generative AI for corpus approaches to discourse studies: A critical evaluation of ChatGPT. Applied Corpus Linguistics, 4(1), 100082.

Gao, S., Gales, M., & Xu, J. (2024). Detecting aberrant responses in automated L2 spoken English assessment. In C. A. Chapelle, G. H. Beckett, & J. Ranalli (Eds.), Exploring artificial intelligence in applied linguistics (pp. 96–117). Iowa State University Digital Press.

Iyer, A., Vojjala, S., & Andrew, J. (2024). Augmenting sentiments into Chat-GPT using facial emotion recognition. 10th International Conference on Advanced Computing and Communication Systems (ICACCS), 69–74.

Jia, Y., & Aryadoust, V. (2024). The utility of generative Artificial Intelligence in rating interpreters’ accuracy: A case study of ChatGPT-4. In C. A. Chapelle, G. H. Beckett, & J. Ranalli (Eds.), Exploring artificial intelligence in applied linguistics (pp. 59–72). Iowa State University Digital Press.

Johnson, C. W., & Paulus, T. M. (2024). Generating a reflexive AI-assisted workflow for academic writing. The Qualitative Report, 29, 2772–2792.

Joshi, P., Santy, S., Budhiraja, A., Bali, K., & Choudhury, M. (2020). The state and fate of linguistic diversity and inclusion in the NLP world. arXiv preprint arXiv:2004.09095.

Khalifa, A., & Albadawy, M. (2024). Using artificial intelligence in academic writing and research: An essential productivity tool. Computer Methods and Programs in Biomedicine Update, 5, 100145.

Kim, H., Baghestani, Sh., Yin, Sh., Karatay, Y., Kurt, S., Beck, J., & Karatay, L. (2024). ChatGPT for writing evaluation: Examining the accuracy and reliability of AI-generated scores compared to human raters. In C. A. Chapelle, G. H. Beckett, & J. Ranalli (Eds.), Exploring artificial intelligence in applied linguistics (pp. 73–95). Iowa State University Digital Press.

Kim, M., & Lu, X. (2024). Exploring the potential of using ChatGPT for rhetorical move-step analysis: The impact of prompt refinement, few-shot learning, and fine-tuning. Journal of English for Academic Purposes, 71, 101422.

Kobak, D., Márquez, R. G., Horvát, E. Á., & Lause, J. (2024). Delving into ChatGPT usage in academic writing through excess vocabulary.

Koltovskaia, S., Rahmati, P., & Saeli, H. (2024). Graduate students’ use of ChatGPT for academic text revision: Behavioral, cognitive, and affective engagement. Journal of Second Language Writing, 65, 101130.

Kyle, K., Crossley, S. A., & Jarvis, S. (2021). Assessing the validity of lexical diversity using direct judgements. Language Assessment Quarterly, 18(2), 154–170.

Lee, U., Jung, H., Jeon, Y., Sohn, Y., Hwang, W., Moon, J., & Kim, H. (2023). Few-shot is enough: Exploring ChatGPT prompt engineering method for automatic question generation in English education. Education and Information Technologies, 29, 11483–11515.

Lin, Z., & Chen, H. (2024). Investigating the capability of ChatGPT for generating multiple-choice reading comprehension items. System, 123, 103344.

Lu, X. (2010). Automatic analysis of syntactic complexity in second language writing. International Journal of Corpus Linguistics, 15(4), 474–496.

Ma, Q., Crosthwaite, P., Sun, D., & Zou, D. (2024). Exploring ChatGPT literacy in language education: A global perspective and comprehensive approach. Computers and Education: Artificial Intelligence, 7, 100278.

Marshall, D. T., & Naff, D. B. (2024). The ethics of using artificial intelligence in qualitative research. Journal of Empirical Research on Human Research Ethics, 19(3), 92–102.

Mizumoto, A., & Eguchi, M. (2023). Exploring the potential of using an AI language model for automated essay scoring. Research Methods in Applied Linguistics, 2(2), 100050.

Mizumoto, A., Shintani, N., Sasaki, M., & Teng, M. F. (2024). Testing the viability of ChatGPT as a companion in L2 writing accuracy assessment. Research Methods in Applied Linguistics, 3(2), 100116.

Mojadeddi, Z., & Rosenberg, J. (2024). Automated transcription of interviews in qualitative research using artificial intelligence: A simple guide. Journal of Surgery Research and Practice, 5, 1–6.

Mollick, E. (2024). Co-intelligence: Living and working with AI. Portfolio/Penguin.

Morgan, D. L. (2023). Exploring the use of artificial intelligence for qualitative data analysis: The case of ChatGPT. International Journal of Qualitative Methods, 22, 16094069231211248.

Paulus, T. M., & Marone, V. (2024). “In minutes instead of weeks”: Discursive constructions of generative AI and qualitative data analysis. Qualitative Inquiry, 31(5).

Pfau, A., Polio, C., & Xu, Y. (2023). Exploring the potential of ChatGPT in assessing L2 writing accuracy for research purposes. Research Methods in Applied Linguistics, 2(3), 100083.

Polio, C., & Yoon, H. J. (2018). The reliability and validity of automated tools for examining variation in syntactic complexity across genres. International Journal of Applied Linguistics, 28, 165–188.

Sacks, H., Schegloff, E. A., & Jefferson, G. (1974). A simplest systematics for the organization of turn-taking in conversation. Language, 50(4), 696–735.

Sen, M., Sen, S. N., & Sahin, T. G. (2023). A new era for data analysis in qualitative research: ChatGPT! Shanlax International Journal of Education, 11, 1–15.

Shin, D., & Lee, J. H. (2023). Can ChatGPT make reading comprehension testing items on par with human experts? Language Learning & Technology, 27(3), 27–40. [URL].

Smith, G., Fleisig, E., Bossi, M., Rustagi, I., & Yin, X. (2024). Standard language ideology in AI-generated language. arXiv preprint arXiv:2406.08726.

Spring, R., & Johnson, M. (2022). The possibility of improving automated calculation of measures of lexical richness for EFL writing: A comparison of the LCA, NLTK and SpaCy tools. System, 106.

Tankosić, A., Milak, E., Steele, C., & Dobinson, T. (2024). Meeting standards: (Re)colonial and subversive potential of AI modification. Australian Review of Applied Linguistics, 47(3), 366–382.

Voss, E. (2024). Artificial intelligence and linguistic landscape research: Affordances, challenges & considerations. Linguistic Landscape, 10(4), 400–424.

Wang, P. (2019). On defining artificial intelligence. Journal of Artificial General Intelligence, 10(2), 1–37.

Wang, Z., Xie, Q., Feng, Y., Ding, Z., Yang, Z., & Xia, R. (2023). Is ChatGPT a good sentiment analyzer? A preliminary study. arXiv preprint arXiv:2304.04339.

Xu, Y., Zhuang, J., Blair, B., Kim, A., Li, F., Thorson Hernández, R., & Plonsky, L. (2023). Modeling quality and prestige in applied linguistics journals: A bibliometric and synthetic analysis. Studies in Second Language Learning and Teaching, 13(4), 755–779.

Xu, Y., Polio, C., & Pfau, A. (2024). Optimizing AI for assessing L2 writing accuracy: An exploration of temperatures and prompts. In C. A. Chapelle, G. H. Beckett, & J. Ranalli (Eds.), Exploring artificial intelligence in applied linguistics (pp. 151–174). Iowa State University Digital Press.

Yu, D., Li, L., Su, H., & Fuoli, M. (2024). Assessing the potential of LLM-assisted annotation for corpus-based pragmatics and discourse analysis: The case of apology. International Journal of Corpus Linguistics, 29(4), 534–561.

Zhang, X., Diaz, A., Chen, Z., Wu, Q., Qian, K., Voss, E., & Yu, Z. (2024). DECOR: Improving coherence in L2 English writing with a novel benchmark for incoherence detection, reasoning, and rewriting. arXiv preprint arXiv:2406.19650.
