Article published in: Multidisciplinary Perspectives on Human-AI Team Trust
Edited by Nicolo' Brandizzi, Morgan Elizabeth Bailey, Carolina Centeio Jorge, Myke C. Cohen, Francesco Frattolillo and Alan Richard Wagner
[Interaction Studies 26:2] 2025
pp. 229–266
Exploring trust in AI-supported military teams using sentiment analysis
Published online: 27 February 2026
https://doi.org/10.1075/is.24046.kuc
Abstract
Examining sentiment in team communications can provide information about trust among teammates. Natural language
processing (NLP) models provide an efficient means of sentiment analysis. However, military teams and other professional teams use
language that differs from what NLP models are trained on, leading to potentially inaccurate sentiment analysis. This study
investigates the novel application of two advanced NLP models, DistilBERT and GPT-2, for sentiment analysis of expert military
teams conducting AI-supported combat missions in a high-fidelity simulation environment. Our fine-tuning process resulted in
improved sentiment classification accuracy. The sentiment measures also correlated with measures of team trust and trust in the AI
systems, providing valuable insight into the relationship between sentiment and trust in human-AI teaming scenarios. The
generalized approach we describe may be useful for adapting sentiment analysis and NLP techniques to military teams, and may help
measure trust dynamics and team states in human-machine integrated teams.
Article outline
- 1. Introduction
- 1.1 Sentiment analysis in military teams
- 1.2 Trust in human-machine integrated teams
- 1.3 Communication sentiment and trust
- 1.4 The current study
- 2. Methodology
- 2.1 Dataset description
- 2.2 Model development and tuning
- 2.3 Transcription and cleaning
- 2.4 Data segmentation and annotation
- 2.5 Fine-tuning transformer models for sentiment analysis
- 2.6 Models comparison of one vs two annotators
- 2.7 Conversion to sentiment ratios
- 2.8 Trust questionnaires
- 2.8.1 Team trust
- 2.8.2 AITR and DTAS trust
- 2.9 Data analysis
- 2.9.1 Team trust
- 2.9.2 AITR trust
- 2.9.3 DTAS trust
- 2.9.4 Statistical analysis
- 3. Results
- 3.1 Analysis of team trust levels
- 3.2 Analysis of adaptive aided target recognition (AITR) trust
- 3.3 Analysis of dynamic task allocation software (DTAS) trust levels
- 3.4 Analysis of the relationship between team trust and AITR/DTAS trust
- 3.5 Qualitative examination
- 4. Discussion
- 4.1 Model tuning for the military context
- 4.2 Sentiment and trust
- 4.2.1 Trust in teammates
- 4.2.2 Trust in DTAS and AiTR
- 4.3 Limitations and future work
- 5. Conclusion
- Acknowledgements
- References
