Review article published in: Digital Translation
Vol. 11:1 (2024), pp. 73–84
Industry watch
The rise of large language models informed by not so large corpora of training data
This article is available free of charge.
Published online: 14 June 2024
https://doi.org/10.1075/dt.24006.lom
Article outline
- “Magical” English performance may not translate
- Limited training data has real-world effects
- Multiple order-of-magnitude variation increases digital divides
- Unclear criteria for usability cloud plans for improvement
- An existential crisis for languages of lesser diffusion?
- Implications for the language sector
References
Bavarian, Mohammad, Angela Jiang, Heewoo Jun, & Henrique Pondé. 2022. “New GPT-3 capabilities: Edit & insert”. OpenAI blog, March 15, 2022. [URL]
Common Crawl. n.d. “Statistics of Common Crawl Monthly Archives”. [URL]. Accessed February 4, 2024.
Cooper, Kindra. 2023. “OpenAI GPT-3: Everything You Need to Know [Updated]”. Springboard, September 27, 2023. [URL]
DePalma, Donald A. & Arle Lommel. 2023. “Locales and Focused Large Language Models”. CSA Research, October 4, 2023. [URL]
Dickson, Ben. 2022. “Three key takeaways from Meta’s Galactica AI”. TechTalks, November 21, 2022. [URL]
Heaven, Will Douglas. 2023. “The insider story of how ChatGPT was built from the people who made it”. MIT Technology Review, March 3, 2023. [URL]
Lai, Viet Dac, Nghia Trung Ngo, Amir Pouran Ben Veyseh, Hieu Man, Franck Dernoncourt, Trung Bui, & Thien Huu Nguyen. 2023. “ChatGPT Beyond English: Towards a Comprehensive Evaluation of Large Language Models in Multilingual Learning”. [URL]
Lommel, Arle. 2023. “Is Generative AI’s Translation Output Usable and for What?”. CSA Research, April 25, 2023. [URL]
Dave, Paresh. 2023. “ChatGPT Is Cutting Non-English Languages Out of the AI Revolution”. Wired, May 31, 2023. [URL]
Thompson, Brian, Mehak Preet Dhaliwal, Peter Frisch, Tobias Domhan, & Marcello Federico. 2024. “A Shocking Amount of the Web is Machine Translated: Insights from Multi-Way Parallelism”. [URL]
Vaughn, Thom. 2023. “November/December 2023 Crawl Archive Now Available”. Common Crawl, December 15, 2023. [URL]
Cited by (2)
Krüger, Ralph
Lundin, Therese
This list is based on CrossRef data as of 8 December 2025. Please note that it may not be complete. Sources presented here have been supplied by the respective publishers. Any errors therein should be reported to them.
