Investigating cognitive and interpersonal factors in hybrid human-AI practices: An empirical exploration of interlingual respeaking
This article investigates interlingual respeaking (IRSP) as a hybrid practice involving direct interaction between a language professional and speech recognition software to produce live subtitles. This technique combines language transfer with the diamesic shift of subtitling, the immediacy factor of interpreting, and direct interaction with AI-driven technology. It is, therefore, a prime example of Multilectal Mediated Communication which requires language professionals to adapt the procedural skills acquired through training and professional experience in related fields. The article reports on an empirical study that explores cognitive abilities and interpersonal traits as underexamined but critical elements of competence in IRSP, guided by Hambrick et al.’s (2018) multifactorial model of expertise. A mixed-methods exploratory design was used to investigate factors influencing IRSP performance. Fifty-one language professionals participated in a twenty-five-hour upskilling course, completing pre- and post-experiment surveys as well as a battery of cognitive tasks and interpersonal skills scales. Multiple regression analyses were used to investigate predictors of IRSP accuracy and errors. Our study found that, among cognitive variables, complex working memory predicts accuracy. Additionally, different working memory resources were found to be involved in different error categories. In relation to interpersonal traits, integrated regulation (a measure of motivation) and conscientiousness were found to negatively predict accuracy.
The study highlights the interplay between human cognitive and interpersonal variables in IRSP as a complex form of human-AI interaction, establishing a baseline for future research and advancing methodological approaches and training frameworks for language professionals working in AI-assisted environments.
Table of contents
- Abstract
- Keywords
- 1.Introduction
- 2.Defining the domain: Interlingual respeaking as HAII
- 3.Study design and methodology
- 4.Key findings on cognitive abilities
- 5.Key findings on interpersonal traits
- 6.Discussion
- Acknowledgments
- Notes
- Funding
- References
- Address for correspondence
1.Introduction
AI-driven technologies are reshaping the landscape of many professional practices, both in terms of how activities are performed and what is required to achieve successful performance. The language industry is a case in point, where speech recognition (SR) and machine translation (MT) have blurred the boundaries of traditional practices, giving rise to increasingly hybrid ones (Davitti 2025). This article centres on interlingual respeaking (IRSP) as a form of real-time speech-to-text relying on direct interaction between humans (language professionals) and machines (SR) to produce live subtitles. This practice crosses diamesic (spoken-to-written) and linguistic (source-to-target language) boundaries, introducing a unique form of real-time human-AI interaction (HAII) that adds layers of complexity to established language practices.
Although still emerging in demand and adoption, IRSP holds the potential to provide an inclusive service by broadening access to different types of audiovisual content across languages and diverse abilities. It is also a prime example of Multilectal Mediated Communication (MMC), a process which requires language professionals to adapt skills from related disciplines to a new workflow. Limited evidence-based research exists on this intricate process, which demands a composite skillset. Existing studies have focused mostly on procedural skills. Cognitive abilities and interpersonal skills, crucial for supporting and enhancing performance, have received less empirical validation. This article argues for a clear distinction between different domains of competence and calls for more bottom-up investigation of cognitive and interpersonal factors. Drawing on insights from the SMART project (“Shaping Multilingual Access through Respeaking Technology”; Economic and Social Research Council UK, ES/T002530/1, 2020–2023; project website: https://smartproject.surrey.ac.uk/), a study designed to explore competences, accuracy, and upskilling in IRSP across different languages, directionalities, and content types (see Section 3), this article contributes empirical findings to the currently limited body of research on this topic.
Our study sheds light on some inherently human aspects at play in this complex practice, which may be applicable to other hybrid HAII environments. In an era where AI is perceived as a threat to human employment in several domains, studying how humans operate within highly technologized environments can help us gain a deeper understanding of the intricacies of HAII and human adaptability to these environments. This study underscores the need for a comprehensive HAII framework enriched with considerations such as users’ cognitive limitations and individual characteristics to advance towards people-centred AI.
2.Defining the domain: Interlingual respeaking as HAII
IRSP (also referred to as ‘speech-to-text interpreting’ or ‘transpeaking’) is a method for real-time rendering of a spoken source-language (SL) utterance into written target language (TL). Respeakers listen to live input and simultaneously render it (with added oral punctuation and any relevant content labels, e.g., to indicate speaker change or sound) in a target language to SR software that turns the rendition into written text displayed on screen (Davitti and Sandrelli 2020; Pöchhacker and Remael 2019). IRSP is thus a multitasking endeavour that combines language transfer with the diamesic shift normally characteristic of subtitling, the immediacy factor characteristic of interpreting, and the direct interaction with SR technology, which is typical of its predecessor, intralingual respeaking. In IRSP, the language professional is entrusted with the real-time rendering of a source message into a TL, with the awareness that the resulting output will be processed by SR to turn spoken language into written text. This extends the challenges of traditional simultaneous interpreting and the strategic behaviour typically associated with it, as the recipient of the language professional’s output is not a human with the capacity to infer meaning from context, but a machine.
To ensure that the diamesic conversion leads to subtitles that are comprehensible, readable, and displayed with the minimum possible delay, IRSP entails the addition of several tasks: for instance, voicing punctuation orally; articulating and controlling prosody to optimise recognition (referred to as ‘software-adapted delivery’, or SAD, in the SMART project); monitoring one’s own spoken output and what is displayed by SR on screen (also referred to as ‘audiovisual monitoring’; Pöchhacker and Remael 2019); and providing on-the-go editing of the product (live subtitles in the TL).
The production and monitoring phases are thus initiated by the professional but must be adjusted to this specific form of human-AI hybrid (Fabri et al. 2023), where human agents undertake tasks in close interaction with artefact(s). As Pöchhacker and Remael (2019, 135) put it, “the interface between the [respeaker’s] output and the recognised text is a crucial point in the process: it is evidently shaped by the human agent as much as by the capabilities of the software.” This collaboration surpasses typical HAII, aligning more with human-autonomy teamwork (O’Neill et al. 2022), where human agents work interdependently with a technological agent towards the shared goal of producing interlingual live subtitling. According to Rai, Constantinides, and Sarker (2019), human-AI hybrids involve a dynamic combination of individual competences of human agents and AI-enabled systems. Crucially, AI-led technologies often fall short in terms of accuracy and may produce incomplete or improperly formatted output, requiring additional human effort to achieve a satisfactory outcome.
Given the novelty and hybridity of IRSP, several questions arise about what competences underlie and best support performance, how the required tasks are managed cognitively, and the consequent impact on the language professional at the core of the practice. This article focuses on key competences in IRSP as a complex form of HAII, with specific emphasis on cognitive and interpersonal factors that have received limited attention to date.
2.1Interlingual respeaking competences
The hybrid nature of IRSP inherently requires multiple skills at different levels. A small yet expanding body of literature has sought to identify and classify key components of competence to prepare future language professionals for this practice.
Pöchhacker and Remael (2019) proposed a competence model mapped against a tripartite structure of the IRSP process (that they name ‘transpeaking’), namely pre-, peri- and post-process stages (see Figure 1). It draws on a broad definition of competence encompassing different types of resources: “declarative knowledge (knowing what), procedural knowledge or skills (knowing how), and socio-psychological resources, such as having the willingness and ability to work in a team” (2019, 138). This model integrates technical-methodological competences (i.e., procedural, related to the IRSP task), linguistic and cultural competences, world and subject-matter knowledge, and interpersonal and professional skills. As acknowledged by the authors, this model is “largely hypothetical” (2019, 141). However, it is based on a comprehensive understanding of IRSP mechanics, in relation to shared features with established practices like interpreting and subtitling. It has also been found to align empirically with experimental studies offering bottom-up evidence to ground the model.
Dawson (2019, 2020; Dawson and Romero-Fresco 2021) identified and examined empirically a wide range of task-specific skills, which they organised according to their origin (i.e., subtitling, interpreting, intralingual respeaking), and to different stages in the respeaking process. The top five task-specific skills are identified as multitasking, live translation, dictation (and punctuation), language, and source language comprehension. The four skills on the far right in Figure 2 are described as ongoing skills required during more than one stage of IRSP.
These models categorise skills into IRSP stages, with the majority falling under procedural skills. Despite the varied terminology, task-specific and technical-methodological skills effectively refer to the same set of domain-specific skills needed in the peri-process. What seems to emerge less clearly is how cognitive and interpersonal skills relate to procedural ones. In relation to cognitive skills, Dawson’s model lists short-term and long-term memory as peri-process and ongoing skills, respectively. Interpersonal skills are listed in both models, and described by Pöchhacker and Remael as “socio-psychological traits and requirements” (2019, 141), including teamwork ability. However, there is no detailed breakdown or categorisation provided for these skill groups, which remain posited rather than empirically tested. In the SMART project, we thus decided to investigate some of them bottom-up, to enrich current models that have primarily focused on domain-specific determinants of competence.
Our project acknowledges the importance of capitalising on the achievements of existing models and seeks to engage with them from the perspectives of cognitive psychology and HAII, fostering discussion between different disciplines. To this end, we embrace Hambrick et al.’s (2018) view, advocating for a multifactorial model to achieve a more nuanced understanding of the factors contributing to the development of competence and, ultimately, expertise. Their approach is based on the idea that ‘deliberate practice’ alone (Ericsson, Krampe, and Tesch-Römer 1993, 397), intended as relevant activities through which “individuals acquire virtually all of the distinguishing characteristics of expert performers,” is not the only key predictor of expert performance. The authors rely on a multifactorial gene-environment interaction model of expertise, according to which both domain-general traits and domain-specific knowledge/skills need to be accounted for as they may impact performance directly. To this end, several explanatory constructs should be considered, such as different forms of experience, human traits, and task and situational factors (e.g., task complexity and performance pressure). As Hambrick et al. (2016, 45) put it, “expertise is, at its core, a multiply determined phenomenon whose richness and complexity can never be adequately understood by focusing on one, or one class, of determinants or by using one methodological approach.”
Looking at IRSP through the lens of HAII, an approach considering human factors is needed to enrich our understanding of multitasking environments, where a “two-dimensional framework between the level of automation and human control may not be sufficient in capturing the complexities of how people use AI systems” (Pacailler et al. 2022, 479). As suggested by these authors, including users’ cognitive limitations is key since interacting with AI can reduce performance in other tasks due to limited attentional resources. In the context of IRSP, the integration of SR opens up challenges for the human interacting with it. For instance, it requires a different type of monitoring and distribution of cognitive abilities and attention which may lead to changes in language professionals’ habits and strategic behaviour, but also in their attitude towards such new practices. In addition, the “ubiquity of AI means that people with a variety of attitudes, experiences, and characteristics will be interacting with these systems” (Pacailler et al. 2022, 476). Given the hybrid nature of IRSP, which transcends professional boundaries, individuals from different, yet relevant, backgrounds could engage with this emerging practice.
Here, hybridity is not conceptualised as a static condition with predetermined, rigid roles but as an ongoing, dynamic process that interweaves human biological and cognitive activities and pre-existing skills with the abilities of other entities.
It is therefore crucial to consider individual user characteristics in modern frameworks striving to embrace a genuinely human-centric approach to HAII. Empirical analysis is urgently needed to understand what human variables underlie performance and create a baseline for future studies. Building on these premises, the SMART project adopted an exploratory approach to identify empirically competences that can support the human-in-the-loop when performing IRSP.
2.2The SMART competence model
The SMART project expands on the set of skills identified by existing research, dividing them into four major groups. Given our conceptualisation of competence as a broad collection of related skills, abilities, and knowledge that enable a person to perform a task effectively, we posit the four groups to include not only procedural skills and declarative knowledge, but also cognitive abilities and interpersonal traits, since they are critical factors playing a role in acquisition and performance. A key tenet is that some elements can be trained while others are pre-existing and linked to individual differences that may affect an individual’s suitability for the task.
Declarative knowledge “comprises all of the concepts and cognitions that people can potentially draw upon for use in comprehending new information or making judgements and behavioural decisions” (Wyer 2022, 47). It is the knowledge forming the necessary basis for “comprehending new information, inferences, and behavioural decisions, whereas procedural knowledge refers to the cognitive processes that operate on this content in order to generate these inferences and decisions” (48). Beyond the linguistic and cultural knowledge and world-general (e.g., current affairs) and subject-specific (e.g., domain of a specific assignment) knowledge categories put forward by Pöchhacker and Remael (2019), two additional sub-categories of declarative knowledge are relevant for IRSP. First, IRSP task and process knowledge; that is, a full conceptual understanding of IRSP as a speech-to-text interlingual process involving human and SR within a specific communicative setting. Second, professional knowledge, which resonates with what Pöchhacker and Remael (2019) refer to as “professional competence.” This ranges from “compliance with an employer’s relevant guidelines and procedures to networking and marketing skills for freelancers and to continuing professional development, not least regarding accessibility and digital technologies” (Pöchhacker and Remael 2019, 141).
In line with previous studies, procedural skills refer to the practical knowing how involved in executing IRSP, also referred to as technical-methodological and task-specific skills. These skills can be taught. Traditionally, approaches to training have introduced IRSP after modules covering intralingual respeaking, simultaneous interpreting, and pre-recorded subtitling as distinct entities. However, IRSP involves a unique combination of skills that do not align precisely with those required for other practices and need adaptation. SMART divided the core IRSP procedural components into building blocks to be acquired progressively and incrementally. These included: (1) software management (preparing and optimising SR software); (2) live translation (listening comprehension, analysis, strategic reformulation in the same and a different language); (3) skills to interact with SR successfully, such as SAD, a term coined to identify required changes to the way one speaks in terms of articulation, pace, intonation, pausing, and chunking; (4) verbalisation of punctuation and addition of relevant auditory information (e.g., speaker change); and (5) error correction via different methods. We propose that these skills operate within a multitasking and audiovisual monitoring framework, underpinned by pre-existing cognitive abilities but adaptable through targeted training.
In line with Hambrick et al. (2016), our multifactorial model of IRSP competence includes domain-general factors, defined as underlying characteristics inherent to the individuals, which may “predispose people in different ways to sustain high levels of performance for extended periods of time” (Ericsson, Krampe, and Tesch-Römer 1993) and, ultimately, have an impact on the development of expertise. Cognitive abilities are typically assumed to be fairly stable over time, innate to an individual, organised in domains of functioning, hierarchically structured, and interlinked. There is “ample evidence that cognitive abilities positively predict individual differences in complex task performance early in training […] and beyond” (Ackerman 2014, 104). With IRSP being a multitasking form of HAII requiring enhanced cognitive performance, these abilities may influence skill acquisition and performance and, in turn, be influenced by training.
In translation-related practices, “the previously underestimated importance of generic cognitive competences requires more attention” (Bernardini et al. 2019, 300). Pöchhacker and Remael (2019) explored the cognitive architecture of IRSP theoretically, adapting Gile’s Effort Model (2015, 2023), initially developed for interpreting. They explicitly link the concept of available processing capacity to working memory (WM) (Baddeley and Hitch 1974), which is involved in both information storage and processing, and executive functions (EF) that are used to coordinate and control other cognitive behaviours. In SMART, we also posited the theoretical suitability of WM as a construct, as language professionals involved in IRSP strategically reformulate the message in the TL to conform to subtitling requirements and in a manner adapted to aid SR.
Baddeley and Hitch (1974) proposed a central executive system (CE) as part of WM to direct the mechanisms and processes that control, regulate, and actively maintain task-relevant information. The SMART project builds on Miyake et al.’s (2000) EF framework, which identified three components: shifting, updating, and inhibition. EF comprise distinct but interrelated processes that have an underlying common mechanism (i.e., unity and diversity). Together, these explain how the CE performs EF. Shifting refers to the ability to switch rapidly between tasks, operations or mental sets, requiring attentional control. In IRSP, shifting between tasks is one way to conceive of ‘multitasking’. Rather than a number of simultaneous processes, it can be thought of as rapidly shifting between parts of the process, such as switching between languages (i.e., drawing on SL for comprehension and then TL for dictation) and between auditory monitoring of the source message, visuo-spatial monitoring of the output (on-screen text) for errors, and updating the output if required. Updating is the second component of EF. It extends beyond the simple maintenance of task-relevant information to include the dynamic manipulation of WM to allow for information to be, for instance, reformulated, another essential aspect of the IRSP process (Pöchhacker and Remael 2019). Inhibition is the final component of EF, which suppresses unwanted automatic responses (Miyake et al. 2000). For example, an interpreter must inhibit SL vocabulary and syntax to prevent it from interfering with TL production. Equally, an interlingual respeaker’s natural prosody needs to be inhibited to allow for SAD required by the interaction with SR. Consequently, these elements were tested empirically in SMART (see Sections 3 and 4).
Interpersonal traits “refer to stable or consistent patterns of behaviour that are relatively immune to situational contingencies” (Zaccaro 2007, 7). These differentiate individuals from each other and are labels for a collection of associated attributes. Considering that previous studies on IRSP have not extensively delineated interpersonal traits, the SMART project built upon existing knowledge primarily within interpreting to identify several potentially relevant traits for empirical exploration. Earlier studies discussed key traits such as the interpreter’s ability to work as a team member, deal with job-induced stress, remain calm under pressure, and exhibit ‘nerves of steel’ (Henderson 1980). More recent accounts have highlighted several important personality traits supporting the work of interpreters. For instance, Bontempo and Napier (2011) confirmed emotional stability as the most significant predictor of self-perceived competence amongst 110 sign language interpreters in Australia. Self-esteem, openness to experience, and conscientiousness were also found as the strongest indicators of sign language interpreters’ competence (Bontempo et al. 2014).
Student interpreters scored significantly higher than students of multilingual communication and student translators on the social initiative and emotional stability dimensions in Rosiers and Eyckmans (2017). Hiltunen, Mäntyranta, and Määttänen (2019) found that compared to control groups of foreign language teachers and non-linguistic experts, simultaneous and consecutive interpreters tend to exhibit higher levels of co-operativeness. Additionally, personality, WM span, and EF (inhibition, updating, and shifting) strongly predicted the choice of language professionals’ career path (Rosiers, Plevoets, and Eyckmans 2020). Finally, an additional study indicated that personality hardiness also predicts interpreting performance (Xing and Zeng 2022).
In conclusion, it is important to acknowledge that these factors cannot be assumed to be independent, and their combination may have a moderating effect on performance. Given the dearth of literature on cognitive abilities and interpersonal traits in IRSP, the SMART project took an exploratory approach to provide insights into potential predictors of accuracy and skill acquisition.
3. Study design and methodology
SMART’s exploratory and experimental approach aimed to provide empirical grounding to our understanding of this novel practice and how language professionals operate within it. To this end, the project adopted a mixed-method, within-subject design to collect and triangulate quantitative and qualitative data. As mentioned in Section 1, the study pursued a comprehensive examination of three IRSP-related dimensions: ‘competences’, namely which human variables underlie performance and what challenges arise during the IRSP process; ‘accuracy’, particularly what predicts output accuracy and how different individual and content characteristics can have an impact on it; and ‘upskilling’, and how it can be optimised for language professionals from different backgrounds.
Figure 3 shows our threefold categorisation of competences into procedural skills, cognitive abilities, and interpersonal traits, aligning these categories with the data collection methods employed during the experiment. Connecting lines link each category to the key specific methodological tools used to investigate them. While findings related to accuracy and upskilling are discussed in separate papers (see Korybski and Davitti 2024; Davitti 2025; Davitti et al. 2025; Davitti et al. under review), this section offers a comprehensive overview of the design adopted to explore IRSP. This design can serve as a blueprint, either in full or in part, for the replication and future investigation of similar avenues (for further details, see Davitti and Wallinheimo 2024).
The experiment was organised in different sequential stages, which replicate some components of the SMART project’s pilot study (Davitti and Sandrelli 2020). To begin, participants filled in an Eligibility Survey (in Qualtrics; see Davitti and Wallinheimo 2024), which was used to obtain informed consent; assess whether any exclusion criteria applied; and gather information on demographics, language proficiency, qualifications, experience in relevant fields, skills, and training history. This was done for thorough background profiling, as well as determining expectations and reasons for joining the study.
Each participant then had an introduction to the experiment over Zoom with a cognitive psychologist, during which their verbal fluency skills were assessed to measure their verbal ability. After this, participants were required to complete a battery of online psychological tests to collect data on relevant cognitive abilities and interpersonal traits. The battery covered five areas of cognition: shifting skills (Plus-Minus Task), simple WM (Digit Span Task), complex WM (Reading Span Task), WM processing (N-back Task), and sustained attention (Sustained Attention to Response Task). Additionally, eight scales were administered to measure interpersonal traits deemed potentially relevant to IRSP, namely trait anxiety, resilience, impulsivity, cognitive flexibility, innovativeness in information technology, personality, work motivation, and mindfulness. For further details, see Table 1. The full battery took approximately sixty minutes to complete. The data obtained were analysed to identify potential statistically significant patterns, establishing an initial baseline for future, hypothesis-driven investigations.
Participants were given a link to the study, which seamlessly guided them through all the cognitive tasks (in Pavlovia) and interpersonal scales (in Qualtrics); for both, see Davitti and Wallinheimo (2024). The cognitive tasks were administered as shown in Table 1 and arranged so that attentionally demanding tasks alternated with scales to ease pressure on participants. Tasks were integrated to ensure a smooth flow, automatically transitioning participants from one task to the next. Participants underwent the full battery before the IRSP course. Following the course, a selected set of cognitive abilities was measured again, namely complex WM via the Reading Span Task, shifting skills via the Plus-Minus Task, and sustained attention via the Sustained Attention to Response Task, to evaluate the impact of training on these functions.
| Cognitive abilities | |||
|---|---|---|---|
| Task name | Short description of measure | Task no. | Platform |
| Verbal Fluency Task | Measure of verbal ability. Participants are given 1 min. to produce as many unique words as possible within a semantic category (category fluency, e.g., animals) or starting with a given letter (letter fluency, e.g., F). | 0 | Zoom |
| Plus-Minus Task | Measure of shifting skills by using simple mathematical equations. Participants start with adding, move into subtraction, and finish with a task where they alternate between addition and subtraction. A switching cost is calculated to see how well the participants alternate between two different types of calculations (i.e., addition and subtraction). | 2 | Qualtrics |
| Digit Span Task (DST) | Measure of simple WM. WM temporarily stores and maintains information that is required for the successful completion of cognitive tasks. Unlike the complex WM measure, which captures both processing and storage, simple WM focuses on WM storage only. Participants complete a forward span (attention) and a backward span (memory), and then must recall digits in ascending order. Individual scores were used for these three different areas of digit span. | 4 | Pavlovia |
| N-back Task | Measure of WM processing (including monitoring, updating, and processing of information). Participants were instructed to monitor a series of stimuli and to respond whenever a stimulus is presented that is the same as the one presented n trials previously. | 7 | Pavlovia |
| Sustained Attention to Response Task (SART) | Measure of sustained attention (i.e., ability to focus over time). In this computer-based go/no go task, participants are required to make a response every time they see a number (1–9) by pressing a key, except when that number is 3, in which case they must withhold their response. | 10 | Pavlovia |
| Reading Span Task (RST) | Measure of complex working memory. It combines a processing component (judging the correctness of a sentence) and a storage component (memorising a series of words for a recall). Only the storage component is reported. The processing component ensures that the participant pays attention to the required task. | 12 | Pavlovia |
| Interpersonal traits | |||
| Scale name | Short description of measure | Task no. | Platform |
| State-Trait Anxiety Inventory (STAI) | Measure of state and trait anxiety (Bieling et al. 1998). Only trait component was used. | 1 | Qualtrics |
| Brief Resilience Scale (BRS) | Measure of the ability to bounce back or recover from stress. | 3 | Qualtrics |
| Barratt Impulsiveness Scale (BIS) | Measure of impulsive or non-impulsive behaviours and preferences. Questions represent three subdivisions: A — Attention impulsivity, M — Motor impulsivity, and NP — Non-planning impulsivity. | 5 | Qualtrics |
| Cognitive Flexibility Scale (CFS) | Measure of individual’s cognitive flexibility. | 6 | Qualtrics |
| Personal Innovativeness in IT Scale (PIIT) | Measure of individual’s willingness to engage with Information Technology (Lopez-Bonilla and Lopez-Bonilla 2012). | 8 | Qualtrics |
| Ten Item Personality Indicators (TIPI) | Brief inventory for the Big Five personality traits: Extraversion, Agreeableness, Conscientiousness, Emotional stability, and Openness to new experiences. | 9 | Qualtrics |
| The Work Extrinsic and Intrinsic Motivation Scale (WEIMS) | Measure of work motivation. Includes six subscales: Intrinsic motivation, Integrated regulation, Identified regulation, Introjected regulation, External regulation, and Amotivation. | 11 | Qualtrics |
| Five Facet Mindfulness Questionnaire (FFMQ) | Measure of five facets of mindfulness (Bohlmeijer et al. 2011): Non-reacting (detaching), Observe, Act aware (acting mindfully), Describe, and Non-judge. Short version was used. | 13 | Qualtrics |
After completing the battery, participants were given access to the twenty-five-hour prototype upskilling course (‘training-for-testing’), delivered over five weeks (see Davitti et al. under review). This course broke down the IRSP technique into core procedural skills (see Section 2.2), taught progressively via a blending and scaffolding approach. A follow-up project, SMART-UP (“Shaping Multilingual Access through Respeaking Technology – Upskilling”; ESRC Impact Acceleration Account, 2023–2025; project website: https://www.surrey.ac.uk/research-projects/smart-shaping-multilingual-access-through-respeaking-technology-upskilling), is currently refining this prototype course into a fully-fledged, adaptable, and customisable continuous professional development model.
Due to the pandemic, the course was self-taught and online, hosted on Moodle. To ensure design rigour, participants progressed systematically through each activity in the same order. Completion of each training block was a prerequisite to starting the next. To explore skill acquisition, a modified version of the NASA Task Load Index (Hart and Staveland 1988) was incorporated at eight course milestones as ‘reflection points’. This involved six Likert-style questions rating mental and physical demands of the activities conducted until then, followed by open-ended comments. Upon completion of the course, participants underwent testing in IRSP with different scenarios, purposefully created to pose different challenges: speed (i.e., monologic script with controlled faster pace, around 140 wpm), planned/unplanned delivery (i.e., monologic script divided into 4 parts alternating between planned delivery and more spontaneous sections), and multiple speakers (i.e., dialogic script with quick exchanges and partial overlaps). All clips revolved around the same themes, and participants received briefs and terminology for personal and software preparation in advance. The speeches catered for all language directionalities of the course (see Section 3.1). Comparability was ensured by rendering the same English script into the other languages and adapting them for flow, idiomaticity, and terminological consistency. During testing, delivery was randomised to prevent practice effects.
Screencast technology was integrated into Moodle to record participants’ performance during the course and final testing. Analysis was conducted using a purpose-made grid (described in Davitti and Sandrelli 2020) that enabled detailed accuracy evaluation using the NTR model (Romero-Fresco and Pöchhacker 2017). This model identifies different error types, categorised into software-related (recognition) and human translation errors (namely content-related, i.e., omissions, additions, and substitutions; and form-related, i.e., style and correctness). Errors are penalised based on their severity, with minor errors incurring a −0.25-point deduction as minimally impacting comprehension; major errors receiving a −0.5-point deduction due to the confusion and information loss they may cause; and critical errors resulting in a −1-point deduction as they may introduce false and misleading information, which is challenging for viewers to detect. ‘Effective editions’, or “deviations from the source text that do not involve a loss of information or that even enhance the communicative effectiveness of the subtitles” (Romero-Fresco and Pöchhacker 2017, 159; see also Korybski and Davitti 2024), are acknowledged in the model, but not accounted for in the score calculation.
Different error types were manually identified by two independent reviewers per language pair and used to calculate the final accuracy score.
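The scoring logic described above can be illustrated with a short sketch. It assumes the accuracy formula of the NTR model, accuracy = (N − T − R) / N × 100, where N is the number of words in the subtitles, T the sum of translation error penalties, and R the sum of recognition error penalties. The function name, the data layout, and the sample figures are hypothetical and are not taken from the SMART data.

```python
# Penalty weights as described in the text: minor, major, critical.
PENALTIES = {"minor": 0.25, "major": 0.5, "critical": 1.0}


def ntr_accuracy(n_words, translation_errors, recognition_errors):
    """Return NTR accuracy in percent: (N - T - R) / N * 100.

    n_words            -- number of words in the subtitles (N)
    translation_errors -- severities of human translation errors (summed into T)
    recognition_errors -- severities of recognition errors (summed into R)
    """
    if n_words <= 0:
        raise ValueError("n_words must be positive")
    t = sum(PENALTIES[severity] for severity in translation_errors)
    r = sum(PENALTIES[severity] for severity in recognition_errors)
    return (n_words - t - r) / n_words * 100


# Hypothetical performance: 200 subtitle words, four translation errors
# and two recognition errors of mixed severity.
score = ntr_accuracy(
    200,
    translation_errors=["minor", "minor", "major", "critical"],
    recognition_errors=["minor", "major"],
)
print(f"{score:.2f}%")
```

Effective editions are deliberately absent from the calculation, mirroring the statement that they are acknowledged in the model but do not affect the score.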
Although an accuracy benchmark has not been validated yet for IRSP (cf. 98% for intralingual respeaking), the percentages obtained allowed assessment of participants’ performance after twenty-five hours of upskilling. Additionally, they were used in multiple regressions to identify accuracy predictors across all participants and scenarios and within specific subgroups. To complement quantitative analysis, at the end of each testing session, participants were asked to review their performance screencast using a retrospective Think-Aloud Protocol.
Finally, an Evaluation Survey (in Qualtrics) collected information about participants’ overall satisfaction with their performance and training. From a technical standpoint, multiple platforms were integrated to facilitate seamless execution of all stages of the design and gather both quantitative and qualitative data. Thorough piloting ensured participants’ engagement and retention during the five-week course, which was repeated for four rounds.
3.1 Participants
Fifty-one language professionals selected from 250+ applications took part in this study. They came from eleven countries (UK, Spain, Italy, France, Germany, Belgium, Australia, Argentina, New Zealand, USA, Peru). There were eight males (Mage = 37.38 years, SD = 10.93 years) and forty-three females (Mage = 40.63 years, SD = 11.51 years), and their age ranged between twenty-three and sixty-five (Mage = 40.12 years, SD = 10.97 years).
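As a quick consistency check, the overall mean age can be recovered from the two subgroup means by weighting each by its group size. The sketch below uses only the figures reported above; the helper name `pooled_mean` is hypothetical.

```python
def pooled_mean(groups):
    """Weighted mean of subgroup means; groups is a list of (n, mean) pairs."""
    total_n = sum(n for n, _ in groups)
    return sum(n * mean for n, mean in groups) / total_n


# Eight males (mean 37.38 years) and forty-three females (mean 40.63 years).
overall = pooled_mean([(8, 37.38), (43, 40.63)])
print(round(overall, 2))  # 40.12
```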
Participants’ recruitment through the Eligibility Survey was based on working languages (English and Italian, French, and/or Spanish mother tongue); professional experience (minimally 2000 hours in at least one relevant language profession); and technical equipment specifications (suitable laptop/PC, headset to take part in the online course). Six main language directions (English into French/Italian/Spanish and vice versa) were covered, with an equal spread of seventeen participants working between English and each of the Romance languages. Participants had professional experience in one or more of the following MMC practices: consecutive interpreting (58.82%); simultaneous interpreting (52.94%); written translation (94.12%); pre-recorded subtitling (58.82%); and live subtitling (21.57%). Most participants had a composite background (i.e., experience in more than one language-related practice among the ones above): twenty-six included three professions in their cluster, followed by thirteen participants with two, nine with four, two with one, and one with five. This reflects the reality of the language industry, where freelancers in particular offer more than one service as part of their portfolio. It also results in different skillsets to rely upon when initiating the acquisition of IRSP. Such complexity needs to be considered when profiling participants. In terms of experience (understood as hours devoted to each profession at the time of the study), the picture was also quite varied, with aggregate professional experience ranging between 2000 and over 40 000 hours.
The rationale for recruiting participants amongst language professionals with relevant professional backgrounds is the scarcity of existing IRSP-trained professionals to date. While representing only a first step in this type of research, the sample supports the assumption that the hybrid nature of this practice attracts people from different MMC practices with a diverse set of knowledge, abilities, traits, and procedural skills, but who need to adjust, unlearn, or acquire other skills. This reality suggests that there is an appetite for further upskilling and diversification in the language industry.
4. Key findings on cognitive abilities
Our multiple regression model (Model 1) for cognitive predictors of accuracy (dependent variable, DV) included simple and complex WM, shifting skills, and sustained attention at baseline (independent variables, IVs). The average performance accuracy, which stood at 95.37%, was used. As reported in Wallinheimo, Evans, and Davitti (2023), a positive association was found between complex WM and accuracy (β = .32, p = .01). The regression model was significant (F(7, 42) = 2.27, p = .04) and explained 15.4% of the variance. None of the other cognitive abilities had any significant effect (p > .05).
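For readers who want to relate the reported F statistic to the share of variance explained, the following sketch shows the standard arithmetic for a least-squares regression with an intercept. The article does not state whether the reported percentage is R² or adjusted R², so the sketch computes both; the function names are hypothetical.

```python
def r2_from_f(f_value, df_model, df_resid):
    """Recover R^2 from an overall F test: F = (R^2 / df1) / ((1 - R^2) / df2)."""
    ratio = f_value * df_model / df_resid
    return ratio / (1 + ratio)


def adjusted_r2(r2, df_model, df_resid):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1), with n - 1 = df1 + df2."""
    return 1 - (1 - r2) * (df_model + df_resid) / df_resid


# Model 1 as reported: F(7, 42) = 2.27.
r2 = r2_from_f(2.27, 7, 42)
print(round(r2, 3), round(adjusted_r2(r2, 7, 42), 3))
```

Under these assumptions, Model 1 yields an R² of roughly .27 and an adjusted R² of roughly .15; the latter is close to the reported 15.4%.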
Additional analyses were conducted in relation to different levels of performance accuracy. Median split was used to establish higher (n = 25) and lower performers (n = 26). ANOVA results confirmed a significant difference in average performance accuracy between the groups (M higher = .97, SE = .002; M lower = .94, SE = .002; F(1, 49) = 88.02, p < .001, ηp² = .64). Crucially, when looking at the cognitive areas, only complex WM was significantly higher for the higher performers (M = .86, SE = .04) compared to the lower performers (M = .76, SE = .04; F(1, 49) = 3.96, p = .05, ηp² = .08). The other cognitive abilities were not significant factors (p > .05).
These findings highlight how complex WM, involved in the simultaneous storage and processing of information, was the sole significant predictor of high accuracy, underscoring its importance as a prime cognitive ability in IRSP, and offering empirical support for Pöchhacker and Remael’s (2019) theoretical model.
Working in high-pressure environments can adversely affect cognitive performance. This is particularly evident in IRSP, where time constraints are a prominent part of the process. We therefore wanted to further understand which cognitive functions predicted the key categories of errors made by the language professionals during IRSP performance (see Section 3). To determine these key categories, a regression model (Model 2) with the six different error types as IVs and accuracy as the DV was tested. The model was significant (F(6, 44) = 56.73, p < .001) and explained 87% of the variance. Out of the six different error types, we found that omission (β = −.1.1, p < .001), recognition (β = −.31, p < .001), substitution (β = −.19, p < .001), and style (β = −.16, p = .03) errors negatively predict accuracy, with omissions being the strongest predictor. We then focused solely on these error categories, revealing that different error types seem to involve different types of WM resources. The multiple regression model (Model 3) that included all three areas of the Digit Span Task as IVs and omissions as the DV was significant (F(1, 48) = 7.46, p = .009), explaining 12% of the variance. Only backward span was a significant predictor, showing a negative association with omissions (t(49) = −2.73, β = −.37, p = .009). Backward span, which is measured as part of the Digit Span Task, requires participants to recall digits in the reverse order of how they were initially presented. There is no simultaneous processing element as part of this task, as backward span is based on the WM storage component only (Conway et al. 2005).
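The grouping step can be sketched as follows. How the participant sitting exactly on the median was assigned is not stated in the text; placing that case in the lower group is an assumption made here, and it happens to reproduce groups of 25 and 26 when the 51 scores are all distinct. The accuracy values below are synthetic placeholders, not project data.

```python
from statistics import median


def median_split(scores):
    """Split scores into (cut, higher, lower); values equal to the median go to 'lower'."""
    cut = median(scores)
    higher = [s for s in scores if s > cut]
    lower = [s for s in scores if s <= cut]
    return cut, higher, lower


# 51 synthetic, distinct accuracy scores standing in for the real data.
synthetic_scores = [0.90 + i * 0.001 for i in range(51)]
cut, higher, lower = median_split(synthetic_scores)
print(len(higher), len(lower))  # 25 26
```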
In IRSP, omission errors occur when the respeaker, intentionally or unintentionally, drops content for various reasons, including not managing to keep up with the speaker’s speech rate, mishearing or not hearing the source content, becoming distracted, or losing concentration (for instance due to focusing on other aspects of the IRSP process). Our findings show a negative association between simple WM and omissions, suggesting a role for higher WM storage capacity in predicting reduced omission rates.
Subsequently, the influence of the N-back, which measures monitoring, updating, and processing of WM, was tested on additions (Model 4), correctness (Model 5), and recognition errors (Model 6). Model 4 was significant (F(2, 47) = 5.24, p = .009) and showed a negative relationship between the N-back and additions (t(49) = −3.0, β = −.41, p = .004). Model 5 was also significant (F(1, 48) = 12.62, p < .001) and revealed a negative association between N-back and correctness (t(50) = −3.55, β = −.46, p = .001). Finally, Model 6 was significant (F(2, 47) = 3.40, p = .04) showing that N-back also predicted recognition errors (t(50) = −2.03, β = −.28, p = .05).
N-back requires participants to maintain, update, and process information while minimising the WM storage component (Morales et al. 2015). Regarding additions (where unnecessary, confusing, or erroneous information is added to the target text) and correctness errors (pertaining to form-related issues like poor use of grammar or punctuation), the findings suggest that higher WM resources facilitating simultaneous information maintenance, updating, and processing can contribute to enhanced performance accuracy, reducing the incidence of such errors among language professionals.
Recognition errors stem from SR alone or from human–SR interaction, due to respeakers’ hesitations, poor articulation, or delivery speed affecting input and leading to subsequent software misrecognition. As explained in Section 2.2, SAD is recommended to enhance recognition. However, performing and monitoring SAD in addition to all other activities involved in IRSP can strain limited WM resources, especially processing ones. The findings suggest that higher WM resources (measured with N-back) may enable more effective processing of SAD, ultimately reducing recognition errors.
The SMART project also found a post-training improvement in two cognitive abilities, namely complex WM (F(1, 46) = 4.0, p = .05, ηp² = .08; M pre = .83, SE pre = .02; M post = .88, SE post = .02) and shifting skills (F(1, 49) = 6.42, p = .02, ηp² = .12; M pre = 22.90 s, SE pre = 2.95 s; M post = 14.55 s, SE post = 1.85 s) (Wallinheimo, Evans, and Davitti 2023). These findings indicate that the course improved these skills and that they possibly underlie the IRSP process. Linking these findings to the predictors above, it can be argued that targeted training enhances relevant WM resources, thereby reducing the incidence of specific errors and improving performance accuracy.
5. Key findings on interpersonal traits
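The effect sizes reported for the pre/post comparison can be related to the F statistics through the usual identity for partial eta squared, ηp² = (F × df_effect) / (F × df_effect + df_error). The sketch below applies that identity to the two reported tests; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F test: F*df1 / (F*df1 + df2)."""
    return f_value * df_effect / (f_value * df_effect + df_error)


# Complex WM: F(1, 46) = 4.0; shifting skills: F(1, 49) = 6.42.
print(round(partial_eta_squared(4.0, 1, 46), 2))   # 0.08
print(round(partial_eta_squared(6.42, 1, 49), 2))  # 0.12
```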
To examine the influence of interpersonal traits on accuracy, a regression model (Model 7) with the interpersonal traits as IVs and accuracy as the DV was tested. The model was statistically significant (F(2, 48) = 4.50, p = .02), explaining 12% of the overall variance. However, integrated regulation and conscientiousness were the only statistically significant predictors of accuracy for the whole group and across all scenarios.
Using the Work Extrinsic and Intrinsic Motivation Scale (WEIMS), which is a measure of work motivation, our study found a negative relationship between integrated regulation (IV) and accuracy (DV) (t(50) = −2.12, β = −.28, p = .04). Integrated regulation, classified within the framework of self-determination theory as a highly self-determined and most autonomous form of extrinsic motivation that fosters positive learning attitudes and outcomes (Deci and Ryan 1985; Ryan and Deci 2000; Howard et al. 2021), occurs when an individual has fully integrated a particular motivation and places significant value on the outcomes of a specific behaviour. It shares characteristics with intrinsic motivation but is classified as an extrinsic type because the person recognises the value and benefits of a certain action without necessarily finding it enjoyable. Additionally, integrated regulation reflects an individual’s desire to be involved in a specific activity for its perceived contribution to developing an enhanced sense of self and presumed instrumental value with respect to a particular outcome (Ryan and Deci 2000).
This seems to be in line with our participants’ key motivations for joining the course, as highlighted in their answers to the Eligibility Survey. Primarily, they exhibited a keen interest in skills development and professional growth, indicating a strong value placed on IRSP as a means of capitalising on their existing background knowledge and “integrating it with new skills,” thus enhancing their career prospects. Participants expressed their desire to “expand my skillset” and “broaden the spectrum of my knowledge and skills” in order to potentially “offer [IRSP] as a professional in the future” and explore opportunities for “new work (well paid!).” Moreover, participants expressed their eagerness to “learn a new technology and way of working” and adapt to a media landscape that is rapidly changing by being “aware of the novelties and… ready for the new challenges that might arise.” As secondary motivations, some wanted to contribute to “research and training in the field” and accessibility, recognising the “innovative nature and increasing demand of this technique.” Numerous remarks also indicated that participants’ decision to take the course was underpinned by the desire “to see if I would be capable of performing IRSP,” suggesting awareness of its complexity but also desire to engage, which may have been heightened by the professional nature of the sample.
A highly motivated workforce is normally associated with higher performance and greater engagement (Ryan and Deci 2000Ryan, Richard M., and Edward L. Deci 2000 “Intrinsic and Extrinsic Motivations: Classic Definitions and New Directions.” Contemporary Educational Psychology 25 (1): 54–67. ). According to Howard et al. (2021)Howard, Joshua L., Julien S. Bureau, Frédéric Guay, Jane X. Y. Chong, and Richard M. Ryan 2021 “Student Motivation and Associated Outcomes: A Meta-Analysis From Self-Determination Theory.” Perspectives on Psychological Science 16 (6): 1300–1323. , the more autonomous a motivation type is, the more positively it relates to adaptive outcomes, such as increased academic success, improved work performance, and overall well-being. In contrast, it relates negatively to maladaptive outcomes, such as anxiety and worry. Our study deviates from these findings as integrated regulation was found to negatively predict performance accuracy.
The highly complex, time-sensitive, and cognitively challenging nature of the practice may have introduced an additional layer of pressure for the individual, thus potentially affecting their cognitive abilities and, ultimately, their task performance. Organismic Integration Theory (Deci and Ryan 1985Deci, Edward L., and Richard M. Ryan 1985 Intrinsic Motivation and Self-Determination in Human Behavior. New York: Plenum. ) posits that extrinsic motivation depends on the extent to which autonomy is present. In other words, people need to feel that they have full control of their own behaviour. This may be constrained in IRSP due to the partial reliance of this process on technological tools, which may make it difficult for the individual to fully internalise the value of the task, contributing to the observed negative relationship between integrated regulation and accuracy. The process of individual transformation from external regulation to one’s own autonomous way of working may thus be more difficult to achieve in this newly-created human-AI environment. With more training and practice, it is anticipated that IRSP performance would also improve, perhaps leading to a positive correlation later down the line. This hypothesis requires further empirical testing. Finally, whilst the degree of self-determination and autonomy is important in relation to work motivation, different motivational types may relate to different outcomes beyond their level of autonomy too (Howard et al. 2021Howard, Joshua L., Julien S. Bureau, Frédéric Guay, Jane X. Y. Chong, and Richard M. Ryan 2021 “Student Motivation and Associated Outcomes: A Meta-Analysis From Self-Determination Theory.” Perspectives on Psychological Science 16 (6): 1300–1323. ). Future studies should investigate these ideas further.
Conscientiousness (as part of TIPI) (IV) was also found to negatively predict performance accuracy (DV) (t(50) = −2.39, β = −.32, p = .02). As a personality trait (part of the Big Five), conscientiousness refers to being careful or diligent; it is a desire to do a task well and it is normally associated with high performance (Thompson 2008Thompson, Edmund R. 2008 “Development and Validation of an International English Big-Five Mini-Markers.” Personality and Individual Differences 45 (6): 542–548. ). Our findings, however, suggest that individuals who scored higher on the conscientiousness scale may have more difficulty maintaining performance accuracy. Previous research has suggested that people on the high end of conscientiousness may be at risk of perfectionism and might perform poorly under stressful conditions as they strive for flawless performance and set unrealistic standards (Coleman, Furnham, and Treglown 2023Coleman, Geoff, Adrian Furnham, and Luke Treglown 2023 “Exploring the Dark Side of Conscientiousness. The Relationship Between Conscientiousness and Its Potential Derailers: Perfectionism and Narcissism.” Current Psychology 42 (31): 27744–27757. ). Perfectionists who worry about being negatively evaluated by others rather than being intrinsically motivated to achieve high goals, spend a lot of their time on worrying about others, leading to lower cognitive performance (Mattes, Mück, and Stahl 2022Mattes, André, Markus Mück, and Jutta Stahl 2022 “Perfectionism-related Variations in Error Processing in a Task with Increased Response Selection Complexity.” Personality Neuroscience 5: e12. ). Higher conscientiousness levels may predispose individuals to overthink and fixate on details, while individuals who score lower on conscientiousness may be more flexible and adaptable, allowing them to adjust more quickly to unexpected challenges that may arise during the process. 
As a cognitively demanding, time-sensitive process performed under pressure, IRSP might be too stressful a process for the respeaker to be highly conscientious at the same time. In line with these findings, a meta-analysis by Wilmot and Ones (2019) highlighted that high occupational complexity, as opposed to low-to-moderate occupational complexity, attenuates the effect of conscientiousness on performance. This was also highlighted in some comments from the Evaluation Survey, where participants felt they had to quickly learn not to “be a perfectionist” and “accept that [they were] going to miss things” when performing IRSP and adapt their approach, as evident in the following comment: “I started focusing on getting the gist not every nuance, as I was struggling to edit and thought this would improve my overall accuracy.”
Although IRSP is a highly human-centric form of HAII, where humans can intervene in the output via editing, respeakers must relinquish part of their autonomy to SR. This can lead to increased frustration when errors occur, as noted by some participants: “When the tech went wrong, I could feel and hear the frustration in my voice.” For individuals high in conscientiousness, this aspect linked to the HAII nature of the task may exacerbate the negative relationship with accuracy and require an attitude adjustment as a coping mechanism, as evident in the following comments: “I tried to relax before the task because I become frustrated with the mistakes if not” and “as the course progressed, I tried not to get so tense […], take deep breaths and go into it alert but relaxed enough to not get frustrated at things I didn’t expect.” The negative correlations examined here may, in part, be linked to early stages of skill development, as participants had only completed twenty-five hours of upskilling at the time of testing. This suggests that awareness of these underlying mechanisms and coping strategies could improve over time. However, further investigation through a longitudinal study would be needed to explore this aspect more comprehensively.
In addition to the analysis across all scenarios reported above, we also looked at interpersonal predictors of accuracy in relation to the speed scenario, which yielded the lowest average accuracy score of 94.76%, compared to multiple speakers (95.51%) and planned/unplanned scenarios (95.83%) (F(2, 150) = 5.04, p = .008, ηp² = .06). As Beilock (2011) notes, some individuals ‘choke under pressure’, leading to suboptimal performance. The speed testing scenario amplified the already cognitively-demanding nature of IRSP, leading to possible performance deficits.
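As a rough consistency check on the effect size reported for the scenario comparison, partial eta squared can be recovered from an F statistic and its degrees of freedom using the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The sketch below is illustrative only: the function name is hypothetical and the code is not part of the study's analysis pipeline.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F statistic and its degrees of freedom.

    Uses the identity eta_p^2 = (F * df_effect) / (F * df_effect + df_error).
    """
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)


# Values taken from the reported test: F(2, 150) = 5.04
eta_p2 = partial_eta_squared(5.04, 2, 150)
print(round(eta_p2, 2))  # rounds to the reported value of .06
```

On this reading, the scenario factor accounts for roughly 6% of the variance that is not attributed to other sources in the model, which is consistent with the value given in the text.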
The regression model (Model 8) including the interpersonal traits as IVs and speed accuracy as a DV was significant (F(1, 48) = 4.71, p = .04) and explained 7% of the variance. Emotional stability emerged as a significant predictor of accuracy in this scenario (t(49) = 2.07, β = .29, p = .04). Emotional stability, which can be seen as the opposite of neuroticism, indicates that an individual stays even-tempered even when possible challenges or threats are experienced. In contrast, emotional instability is associated with feelings of anxiety and lack of self-confidence (Ellis et al. 2018). Our findings suggest that staying calm particularly under increased time pressure has a positive relationship with accuracy. This supports previous research that has identified emotional stability as one of the most important predictors of successful job performance (Judge and Bono 2001; Wati Halim et al. 2011).
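The relationship between the F statistic of Model 8 and the share of variance it explains can be sketched with two standard identities: R² = (F × df_model) / (F × df_model + df_residual), and adjusted R² = 1 − (1 − R²)(n − 1) / df_residual, with n = df_model + df_residual + 1. The example below is a minimal illustration, not the authors' analysis code. The function names are hypothetical, and it assumes that the reported 7% refers to adjusted R², which the text does not state explicitly.

```python
def r_squared_from_f(f_value: float, df_model: int, df_resid: int) -> float:
    """Recover R^2 from the overall F test of a regression model."""
    numerator = f_value * df_model
    return numerator / (numerator + df_resid)


def adjusted_r_squared(r2: float, df_model: int, df_resid: int) -> float:
    """Adjust R^2 for the number of predictors (model with an intercept)."""
    n = df_model + df_resid + 1  # sample size implied by the degrees of freedom
    return 1 - (1 - r2) * (n - 1) / df_resid


# Values taken from the reported test: F(1, 48) = 4.71
r2 = r_squared_from_f(4.71, 1, 48)
adj_r2 = adjusted_r_squared(r2, 1, 48)
print(round(r2, 2), round(adj_r2, 2))  # 0.09 0.07
```

Under the stated assumption, the unadjusted R² is about 9% and the adjusted value about 7%, in line with the figure reported for the model.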
6. Discussion
This article has delved into the emerging and hybrid domain of IRSP, which represents a novel form of HAII. Our investigation is underpinned by the premise that “human agents and AI-enabled systems [are] separate entities with distinct characteristics that globally intra-act in human-AI hybrids” (Fabri et al. 2023, 636), where intra-action refers to a “mutual constitution of entangled agencies” (Barad 2007, 33). More specifically, IRSP entails a nuanced entanglement that goes beyond simply delegating some tasks to AI and requires collaboration between human and machine. This resembles what Fabri et al. (2023, 636) have articulated as “flexible co-evolution,” characterised by a “symbiotic collaboration” where “both [human and AI entities] continuously learn from each other.” From a human perspective, this requires using and, possibly, adapting a wide range of skills, not only procedural but also cognitive and interpersonal, to navigate the real-time demands of this task. However, the inherently human aspects at play in IRSP have not yet been widely explored. By empirically probing cognitive abilities and interpersonal traits, our study sheds some initial light on uncharted territories within this domain.
Being situated within the framework of a wider research project, this article also lays the groundwork for a multifaceted study design that accommodates a nuanced exploration of these dimensions and can be fully or partially replicated in the future to enrich the discourse on HAII.
From a cognitive point of view, integrating AI tools with an individual’s internal resources establishes an extended cognitive system. Through our examination of cognitive predictors of IRSP accuracy, the study empirically underscored the pivotal role played by complex WM as an underlying ability supporting humans in this complex task. An innovative aspect of the article lies in its examination of how WM contributes to mitigating specific error types that impact accuracy, as well as the varied WM resources implicated in different error categories. Through pre-/post-training measurements, the study further highlighted the adaptability of WM and shifting abilities through targeted training, underscoring the importance of conducting longitudinal investigations in the future.
The article also argues that performance in HAII hinges on human attitudes towards AI and the balance of cooperation between these two entities, leading to the exploration of interpersonal traits as predictors of performance accuracy. In this environment, the integration of technology requires that individuals surrender some autonomy to let AI execute certain tasks (e.g., diamesic shift), while still maintaining agency and the ability to correct errors in the output. However, the inherent real-time nature of IRSP exacerbates these challenges, often hindering the human’s ability to maintain control in a time-constrained and multitasking environment. The unexpected negative correlation between traits typically associated with successful performance (i.e., integrated regulation and conscientiousness) and accuracy can be at least partially attributed to the specificities of this HAII environment. Additionally, the significance of emotional stability in scenarios involving heightened speed has become apparent. Future upskilling in these practices should integrate awareness and discussion of such traits and their potential effect on performance.
Our preliminary findings lay the groundwork for further investigation into the intricate relationship between humans and AI within specialised professional practices. While pioneering the integration of cognitive abilities and interpersonal traits in the study of IRSP within the SMART framework, we acknowledge certain limitations that also open avenues for future research. First, our study does not offer a comprehensive understanding of the subject. Given its exploratory nature, our design encompasses a broad range of cognitive abilities and interpersonal traits that, based on literature in related disciplines (e.g., simultaneous interpreting), we deemed potentially relevant to IRSP. On the one hand, this comprehensive approach has allowed us to investigate a wide array of potential factors underlying IRSP performance. On the other hand, a lack of statistical significance in relation to several variables included in the model does not imply that such variables play no role in IRSP accuracy. Due to space limitations, we do not delve into the reasons for these non-significant findings. However, our research can serve as a baseline for future hypothesis-driven studies focusing on specific constructs to further refine theoretical models of IRSP performance and our understanding of the complex set of relationships at play.
Second, we acknowledge that Hambrick’s model (2016) is tailored for expertise, while our study involves language professionals who are experts in their respective fields but are just in the process of acquiring IRSP competence, given the novelty of the practice. However, we build upon this model because it provides a valuable framework for systematically categorising different elements of competence. This has inspired the development of our SMART model of competence, allowing for a structured approach to understanding the multifaceted aspects of skills, knowledge, abilities, and traits within our study context. The next step involves designing methods to explore the interplay of these various elements, and how they moderate performance.
To conclude, in today’s technology-dominated era, it is imperative to study how humans interact with technology and how they feel about it, as these factors will inevitably influence their performance. As machines are increasingly expected to assume control, understanding how humans navigate these environments adds to our knowledge of how well they can perform in them and how they can embrace them in a truly human-centric way.
Funding
Open Access publication of this article was funded through a Transformative Agreement with University of Surrey.
Acknowledgments
This article contains references to analytical data co-created with former members of the SMART project team (see https://smartproject.surrey.ac.uk/smart-team/), whose collaborative efforts significantly contributed to our research.