In: Digital and Internet-Based Research Methods in Applied Linguistics
Edited by Matt Kessler
[Research Methods in Applied Linguistics 15] 2026
pp. 412–432
Chapter 19. Transcription tools
Published online: 5 January 2026
https://doi.org/10.1075/rmal.15.19ho
Abstract
Applied linguistics researchers are increasingly working with a wider range of data, including
audio-recording data and video-recording data collected in online and internet-based contexts. Transcription of such
multimodal data with digital affordances is an important part of the research process. Moreover, the emergence of
generative artificial intelligence technologies has made available a variety of web-based transcription tools and
apps to choose from, such as Otter.ai and Happy Scribe. This chapter will introduce
readers to different types of transcription and transcription tools, specifically in the context of multimodal
transcription. In addition, several example studies which adopt such tools will be described, along with ethical
considerations related to the use of transcription tools. Finally, directions for future research will be
discussed.
Article outline
- 1. Introduction
- 2. Frequently asked research questions
- 3. Implementation
  - Manual transcription
  - Automated transcription
  - Hybrid transcription
- 4. Example studies
  - Satar (2016)
  - Helm and Dooly (2017)
  - Sindoni (2021)
  - Meredith (2016)
  - Palomeque and Pujolà (2018)
  - Recktenwald (2017)
- 5. Ethics and research integrity considerations
- 6. Challenges and issues
- 7. Future research directions
- Notes
- References
References (41)
Androutsopoulos, J. (2008). Potentials and limitations of discourse-centered online ethnography. Language@Internet, 5(9). Retrieved on 24 September 2025 from [URL][URL]
Ayaß, R. (2015). Doing data: The status of transcripts in conversation analysis. Discourse Studies, 17(5), 505–528.
Bezemer, J. (2014). Multimodal transcription: A case study. In S. Norris & C. D. Maier (Eds.), Interactions, images and texts (pp. 155–170). De Gruyter Mouton.
Bezemer, J., & Mavers, D. (2011). Multimodal transcription as academic practice: A social semiotic perspective. International Journal of Social Research Methodology, 14(3), 191–206.
Bokhove, C., & Downey, C. (2018). Automated generation of ‘good enough’ transcripts as a first step to transcription of audio-recorded data. Methodological Innovations, 11(2).
Braun, V., & Clarke, V. (2006). Using thematic analysis in psychology. Qualitative Research in Psychology, 3(2), 77–101.
Cowan, K. (2014). Multimodal transcription of video: Examining interaction in Early Years classrooms. Classroom Discourse, 5(1), 6–21.
Cowan, K., & Kress, G. (2017). Documenting and transferring meaning in the multimodal world. In F. Serafini & E. Gee (Eds.), Remixing multiliteracies: Theory and practice from New London to New Times (pp. 50–61). Teachers College Press.
Crompton, H., Edmett, A., & Ichaporia, N. (2023). Artificial intelligence and English language teaching: A systematic literature review. British Council. [URL]
Domingo, M. (2011). Analyzing layering in textual design: A multimodal approach for examining cultural, linguistic, and social migrations in digital video. International Journal of Social Research Methodology, 14(3), 219–230.
Dooly, M., & Hauck, M. (2012). Researching multimodal communicative competence in video and audio telecollaborative encounters. In M. Dooly & R. O’Dowd (Eds.), Researching online interaction and exchange in foreign language education (pp. 135–162). Peter Lang.
European Commission. (2025). Living guidelines on the responsible use of generative AI in research: ERA forum stakeholders’ document. [URL]
GRAPE-MARS (Multimodal Analysis Research Software). (2023). [Computer software]. [URL]
Helm, F., & Dooly, M. (2017). Challenges in transcribing multimodal data: A case study. Language Learning & Technology, 21(1), 166–185.
Hepburn, A., & Bolden, G. B. (2013). The conversation analytic approach to transcription. In J. Sidnell & T. Stivers (Eds.), The handbook of conversation analysis (pp. 57–77). Wiley-Blackwell.
Ho, W. Y. J., & Li, W. (2019). Mobilizing learning: A translanguaging view. Chinese Semiotic Studies, 15(4), 533–559.
Ho, W. Y. J., & Tai, K. W. H. (2021). Translanguaging in digital learning: The making of translanguaging spaces in online English teaching videos. International Journal of Bilingual Education and Bilingualism, 27(9), 1212–1233.
Jefferson, G. (2004). Glossary of transcript symbols with an introduction. In G. H. Lerner (Ed.), Conversation analysis: Studies from the first generation (pp. 13–31). John Benjamins.
Lapadat, J. C. (2000). Problematizing transcription: Purpose, paradigm and quality. International Journal of Social Research Methodology, 3(3), 203–219.
Liu, Y., & Xiang, Y. (2024). Computer-assisted multimodal analysis research: A technical review of GRAPE-MARS. Multimodality & Society, 4(2), 239–246.
Marshall, D. T., & Naff, D. B. (2024). The ethics of using artificial intelligence in qualitative research. Journal of Empirical Research on Human Research Ethics, 19(3), 92–102.
Meredith, J. (2016). Transcribing screen-capture data: The process of developing a transcription system for multi-modal text-based data. International Journal of Social Research Methodology, 19(6), 663–676.
Messina Dahlberg, G., & Bagga-Gupta, S. (2014). Understanding glocal learning spaces. An empirical study of languaging and transmigrant positions in the virtual classroom. Learning, Media and Technology, 39(4), 468–487.
Ochs, E. (1979). Transcription as theory. In B. B. Schieffelin & E. Ochs (Eds.), Developmental pragmatics (pp. 43–71). Academic Press.
Palomeque, C., & Pujolà, J.-T. (2018). Managing multimodal data in virtual world research for language learning. ReCALL, 30(2), 177–195.
Plowman, L., & Stephen, C. (2008). The big picture? Video and the representation of interaction. British Educational Research Journal, 34(4), 541–565.
Recktenwald, D. (2017). Toward a transcription and analysis of live streaming on Twitch. Journal of Pragmatics, 115, 68–81.
Samuel, G., & Wassenaar, D. (2025). Joint editorial: Informed consent and AI transcription of qualitative data. Journal of Empirical Research on Human Research Ethics, 20(1–2), 3–5.
Satar, H. M. (2016). Meaning-making in online language learner interactions via desktop videoconferencing. ReCALL, 28(3), 305–325.
Sindoni, M. G. (2012). Mode-switching: How oral and written modes alternate in videochats. In M. Cambria, C. Arizzi, & F. Coccetta (Eds.), Web genres and web tools: With contributions from the Living Knowledge Project (pp. 141–153). Ibis.
Sindoni, M. G. (2014). Through the looking glass: A social semiotic and linguistic perspective on the study of video chats. Text & Talk, 34(3).
Sindoni, M. G. (2021). Mode-switching in video-mediated interaction: Integrating linguistic phenomena into multimodal transcription tasks. Linguistics and Education, 62, 100738.
Sindoni, M. G., & Ho, W. Y. J. (2024). A translanguaging and multimodal approach to video-mediated ‘street language learning.’ In R. Hampel & U. Stickler (Eds.), Bloomsbury handbook of language learning and technology (pp. 352–363). Bloomsbury.
Tracy, S. J. (2010). Qualitative quality: Eight “big-tent” criteria for excellent qualitative research. Qualitative Inquiry, 16(10), 837–851.
UNESCO. (2021). Recommendation on the ethics of artificial intelligence. [URL]
YouTube Help. (2025). [URL]
Zhu, H., Dai, D. W., Brandt, A., Chen, G., Ferri, G., Hazel, S., Jenks, C., Jones, R., O’Regan, J., & Suzuki, S. (2025). Exploring AI for intercultural communication: Open conversation. Applied Linguistics Review, 16(2), 809–824.
Zoom Support. (2025). [URL]
