Article published in: Multidisciplinary Perspectives on Human-AI Team Trust
Edited by Nicolò Brandizzi, Morgan Elizabeth Bailey, Carolina Centeio Jorge, Myke C. Cohen, Francesco Frattolillo and Alan Richard Wagner
[Interaction Studies 26:2] 2025
pp. 267–297
Exploratory models of human-AI teams
Leveraging human digital twins to investigate trust development
Published online: 27 February 2026
https://doi.org/10.1075/is.24052.ngu
Abstract
As human-agent teaming (HAT) research continues to grow, computational methods for modeling HAT behaviors and
measuring HAT effectiveness also continue to develop. One emerging method leverages human digital twins (HDTs) to
approximate human behaviors and socio-emotional-cognitive reactions to AI-driven agent team members. To help HDT research
effectively model human trust in HATs, we offer two lines of insight. First, through a review of the HAT trust literature, we
identify key characteristics and attributes of trust that must be considered to properly conceptualize, model, and
measure trust. Through this review, we outline the theoretical foundations of trust needed for effective HDTs capable of emulating
human trust and offer guidance on where and how extant HAT research should translate into HDT modeling and future research.
Second, through causal analyses of archival team communication data from a HAT experiment, we supplement these theoretical
foundations with data-driven insights into the trust-related language HDTs may need to produce to emulate human trust effectively.
Finally, we discuss implications of these combined theoretical and empirical insights for future HDT research, highlighting the
necessity of ongoing validation against human behaviors and the refinement of computational methods. This paper ultimately aims to
advance both the fidelity and applicability of HDTs in modeling nuanced human-agent trust dynamics, fostering more effective and
realistic human-agent collaborations.
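
To make the causal-analysis step more concrete, the following is a minimal sketch of how a causal graph over trust-related communication constructs could be learned from scored transcript windows. It assumes a NOTEARS-style structure learner from the open-source CausalNex library; the construct names, synthetic data, and pruning threshold are illustrative placeholders, not the study's actual variables or pipeline.

```python
# Minimal sketch: learning a causal structure over trust-related
# communication constructs with CausalNex (NOTEARS). Each row stands in
# for one team-communication window scored on hypothetical constructs;
# neither the variable names nor the thresholds come from the article.
import numpy as np
import pandas as pd
from causalnex.structure.notears import from_pandas

rng = np.random.default_rng(0)
n = 200  # number of communication windows (synthetic stand-in data)

# Hypothetical construct scores per window (e.g., from text classifiers).
df = pd.DataFrame({
    "empathy": rng.normal(size=n),
    "positive_affect": rng.normal(size=n),
    "shared_goals": rng.normal(size=n),
})
# Inject a simple dependency so the learner has something to recover.
df["reported_trust"] = (
    0.6 * df["empathy"] + 0.3 * df["positive_affect"]
    + rng.normal(scale=0.5, size=n)
)

# Learn a DAG over the constructs; prune weak edges for readability.
sm = from_pandas(df)
sm.remove_edges_below_threshold(0.2)

for u, v, w in sm.edges(data="weight"):
    print(f"{u} -> {v} (weight={w:.2f})")
```

In practice the construct scores would come from validated text classifiers rather than synthetic draws, and the learned edges would warrant the same caution about causal claims as any observational analysis.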
Article outline
- 1. Introduction
- 2. Theoretical foundations of trust in HATs
- 2.1 Defining and conceptualizing trust in HATs
- 2.2 Critical characteristics of trust for modeling
- Setting HDT initial trust in AI teammates
- Changes in trust over time
- Computational modeling of HDT trust over time (see the illustrative sketch after this outline)
- 2.3 Methods for measuring HAT trust
- Self-reported trust
- Behavioral trust
- Emerging techniques for measuring trust
- 3. Empirical examination of HAT trust
- 3.1 Data-driven insights for modeling HAT trust
- Measuring HAT trust using causal models
- Causal analysis results: Empathy constructs
- Causal analysis results: Socio-cognitive constructs
- Causal analysis results: Emotional constructs
- Causal analysis key takeaways
- 3.2 Preliminary modeling of HDT trust
- Preliminary HDT simulation limitations and implications
- 3.3 Operational implications for defense applications
- 4. Conclusion
- Contributions to HAT science
- Future directions
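
As a companion to the "Computational modeling of HDT trust over time" item above, the sketch below shows one common baseline for trust dynamics, not necessarily the model used in the article: trust as a bounded latent state updated after each teammate action, with a larger step for violations than for successes, consistent with the widely reported asymmetry that trust is easier to break than to build. All parameter values are illustrative placeholders.

```python
# Illustrative HDT trust-dynamics baseline: trust moves toward 1.0 after
# a reliable teammate action and toward 0.0 after a violation, with a
# larger learning rate for violations (trust is easier to break than to
# build). Parameters are placeholders, not values from the article.
from dataclasses import dataclass

@dataclass
class TrustState:
    trust: float = 0.5       # initial (dispositional) trust in [0, 1]
    gain_rate: float = 0.10  # update step after a reliable action
    loss_rate: float = 0.30  # larger step after a trust violation

    def update(self, reliable: bool) -> float:
        """Move trust toward 1.0 on success, toward 0.0 on violation."""
        if reliable:
            self.trust += self.gain_rate * (1.0 - self.trust)
        else:
            self.trust -= self.loss_rate * self.trust
        return self.trust

# Example: one violation amid otherwise reliable behavior.
state = TrustState()
for outcome in [True, True, False, True, True]:
    print(f"reliable={outcome!s:5}  trust={state.update(outcome):.3f}")
```

An HDT would expose this latent state to downstream behavior generation, for example by conditioning its compliance, verification, or communication behavior on the current trust value.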
