Article published in: Multidisciplinary Perspectives on Human-AI Team Trust
Edited by Nicolò Brandizzi, Morgan Elizabeth Bailey, Carolina Centeio Jorge, Myke C. Cohen, Francesco Frattolillo and Alan Richard Wagner
[Interaction Studies 26:2] 2025
pp. 326–356
Perceived trustworthiness and moral competence of a GenAI-enabled ethical robot advisor
Published online: 27 February 2026
https://doi.org/10.1075/is.25072.mom
Abstract
Generative AI agents (GenAIs) powered by large language models (LLMs) have emerged as prominent technological advancements. As these sophisticated systems permeate sectors ranging from business to entertainment, their capability to handle moral queries has become a focal point of exploration. This study investigates how users perceive Delphi, a GenAI trained to respond to moral queries (Jiang et al., 2025). Participants were instructed to interact with the agent, implemented either as a humanlike robot or a web client, to assess its moral competence and trustworthiness. Both agents received high scores for moral competence and perceived morality, yet fell short by not offering justifications for their moral decisions. Although the agent was deemed trustworthy, participants were hesitant about relying on such systems in the future. This study offers an initial evaluation of an algorithm with moral competence in an embodied humanlike interface, paving the way for the evolution of ethical robot advisors.
Keywords: trust, trustworthiness, moral competence, human-robot interaction
Article outline
- 1. Introduction
- 2. Literature background
- 2.1 Improving moral competence in AI
- 2.2 Trustworthiness and trust of a morally competent AI
- 2.3 The current study
- 3. Methods
- 3.1 Participants
- 3.2 Design and conditions
- 3.2.1 Delphi ethical database
- 3.2.2 Disembodied interface
- 3.2.3 Embodied robot
- 3.3 Measures
- 3.3.1 Agreement
- 3.3.2 Moral classifications of participant ethical queries
- 3.3.3 Toxicity scores
- 3.3.4 Perceived moral competence
- 3.3.5 Perceived moral agency
- 3.3.6 Perceived trustworthiness
- 3.3.7 Trust
- 3.3.8 Mind perception
- 3.3.9 Open response questions
- 3.4 Procedure
- 4. Results
- 4.1 Moral nature of queries
- 4.1.1 Agreement
- 4.1.2 Moral foundations theory classification
- 4.1.3 Purity sub-dimensions
- 4.1.4 Impersonal/personal distinction
- 4.1.5 Proscription/prescription distinction
- 4.1.6 Toxicity scores
- 4.2 Mind perception
- 4.3 Perceived moral agency and moral competence
- 4.4 Perceived trustworthiness, trust and intention to rely
- 4.5 Trust
- 4.6 Relationship between trust, moral competence and agreement
- 4.7 Open response data
- 5. Discussion
- 5.1 Summary
- 5.2 Implications for trust theory
- 5.3 Social activation induced by humanlike embodiment
- 5.4 Implications for generating justifications
- 6. Conclusion
- Acknowledgements
References
Amazon Polly. (2022). [URL].
Baldwin, M. W., Carrell, S. E., & Lopez, D. F. (1990). Priming relationship schemas: My advisor and the pope are watching me from the back of my mind, Journal of Experimental Social Psychology 26, 435–454.
Banks, J. (2019). A perceived moral agency scale: Development and validation of a metric for humans and social machines, Computers in Human Behavior 90, 363–371.
Bhat, S., Lyons, J. B., Shi, C., & Yang, X. J. (2024). Evaluating the impact of personalized value alignment in human-robot interaction: Insights into trust and team performance outcomes, in: Proceedings of the 2024 ACM/IEEE international conference on human-robot interaction, pp. 32–41.
Bigman, Y. E., & Gray, K. (2018). People are averse to machines making moral decisions, Cognition 181, 21–34.
Borders, J., Leung, A., & Condon, M. (2025). A framework for identifying key decision-maker attributes in uncertain and complex environments, in: 2025 IEEE Conference on Artificial Intelligence (CAI), IEEE, pp. 1–5.
Botzer, N., Gu, S., & Weninger, T. (2022). Analysis of moral judgment on reddit, IEEE Transactions on Computational Social Systems 1–11.
Breazeal, C., Kidd, C. D., Thomaz, A. L., Hoffman, G., & Berlin, M. (2005). Effects of nonverbal communication on efficiency and robustness in human-robot teamwork, in: 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 708–713.
Card, D., & Smith, N. A. (2020). On consequentialism and fairness, Frontiers in Artificial Intelligence 3.
Caselli, T., Basile, V., Mitrovic, J., & Granitzer, M. (2021). Hatebert: Retraining BERT for abusive language detection in English, arXiv preprint arXiv:2010.12472.
Coleman, C., Neuman, W. R., Dasdan, A., Ali, S., & Shah, M. (2025). The convergent ethics of AI? Analyzing moral foundation priorities in large language models with a multi-framework approach, arXiv preprint arXiv:2504.19255.
Defense Science Board, Task force report: The role of autonomy in DoD systems, Office of the Under Secretary of Defense for Acquisition, Technology and Logistics, 2019.
Dietvorst, B. J., Simmons, J. P., & Massey, C. (2018). Overcoming algorithm aversion: People will use imperfect algorithms if they can (even slightly) modify them, Management Science 64, 1155–1170.
DiSalvo, C. F. (2002). All Robots Are Not Created Equal: The Design and Perception of Humanoid Robot Heads, Technical Report, Ph.D. thesis, MIT.
Epley, N., Waytz, A., & Cacioppo, J. T. (2007). On seeing human: A three-factor theory of anthropomorphism, Psychological Review 114, 864.
Fabre, E. F., Mouratille, D., Bonnemains, V., Palmiotti, G. P., & Causse, M. (2024). Making moral decisions with artificial agents as advisors. A fNIRS study, Computers in Human Behavior: Artificial Humans 2, 100096.
Floridi, L., & Chiriatti, M. (2020). GPT-3: Its nature, scope, limits, and consequences, Minds and Machines 30, 681–694.
Gabriel, I. (2020). Artificial intelligence, values, and alignment, Minds and Machines 30, 411–437.
Gratch, J., & Fast, N. J. (2022). The power to harm: AI assistants pave the way to unethical behavior, Current Opinion in Psychology 47.
Gray, K., DiMaggio, N., Schein, C., & Kachanoff, F. (2023). The problem of purity in moral psychology, Personality and Social Psychology Review 27, 272–308.
Greene, J. D., Sommerville, R. B., Nystrom, L. E., Darley, J. M., & Cohen, J. D. (2001). An fMRI investigation of emotional engagement in moral judgment, Science 293, 2105–2108.
Gutzwiller, R. S., Yousefi, R., Larson-Calcano, T., Lee, J. R., Verma, A., Tenhundfeld, N. L., & Maknojia, I. (2025). Systematic review of the use and modification of the “trust in automated systems scale”, in: Proceedings of the Human Factors and Ergonomics Society Annual Meeting, SAGE Publications, Los Angeles, CA, p. 10711813251357911.
Haidt, J., & Joseph, C. (2007). The moral mind: How five sets of innate intuitions guide the development of many culture-specific virtues, and perhaps even modules, in: The Innate Mind, volume 3, pp. 367–391.
Hauptman, A. I., Schelble, B. G., & McNeese, N. J. (2022). Adaptive Autonomy as a Means for Implementing Shared Ethics in Human-AI Teams, Technical Report, Unpublished manuscript.
Hoff, K. A., & Bashir, M. (2015). Trust in automation: Integrating empirical evidence on factors that influence trust, Human Factors 57, 407–434.
Janoff-Bulman, R., Sheikh, S., & Hepp, S. (2009). Proscriptive versus prescriptive morality: Two faces of moral regulation, Journal of Personality and Social Psychology 96, 521–537.
Jiang, L., Hwang, J. D., Bhagavatula, C., Bras, R. L., Forbes, M., Borchardt, J., Liang, J., Etzioni, O., Sap, M., & Choi, Y. (2021). Delphi: Towards machine ethics and norms, arXiv preprint arXiv:2102.06724.
Jiang, L., Hwang, J. D., Bhagavatula, C., Bras, R. L., Liang, J. T., Levine, S., Dodge, J., Sakaguchi, K., Forbes, M., & Hessel, J., et al. (2025). Investigating machine moral judgement through the Delphi experiment, Nature Machine Intelligence 7, 145–160.
Khavas, Z. Rezaei, Kotturu, M. R., Ahmadzadeh, S. R., & Robinette, P. (2024). Do humans trust robots that violate moral trust?, ACM Transactions on Human-Robot Interaction 13, 1–30.
Kim, B., Wen, R., Zhu, Q., Williams, T., & Phillips, E. (2021). Robots as moral advisors: The effects of deontological, virtue, and Confucian role ethics on encouraging honest behavior, in: Companion of the 2021 ACM/IEEE International Conference on Human-Robot Interaction, pp. 10–18.
Kim, B., Visser, E. de, & Phillips, E. (2022). Two uncanny valleys: Re-evaluating the uncanny valley across the full spectrum of real-world human-like robots, Computers in Human Behavior 135.
Kim, B., Wen, R., Visser, E. J. de, Tossell, C. C., Zhu, Q., Williams, T., & Phillips, E. (2024). Can robot advisers encourage honesty?: Considering the impact of rule, identity, and role-based moral advice, International Journal of Human-Computer Studies 184, 103217.
Kim, M. J., Shaw, T., Kim, B., & Phillips, E. (2025). Does presence and embodiment matter? Investigating telepresence robots as a superior educational modality to videoconferencing, International Journal of Social Robotics 1–12.
Klinger, M., Burton, P., & Pitts, G. (2000). Mechanisms of unconscious priming: I. Response competition, not spreading activation, Journal of Experimental Psychology: Learning, Memory, and Cognition 26, 441–455.
Knijnenburg, B. P., & Willemsen, M. C. (2016). Inferring capabilities of intelligent agents from their external traits, ACM Transactions on Interactive Intelligent Systems 6, 1–25.
Kohlberg, L., Levine, C., & Hewer, A. (1983). Moral stages: A current formulation and a response to critics, in: Contributions to Human Development, volume 10, pp. 174–174.
Kohn, S. C., Visser, E. J. de, Wiese, E., Lee, Y.-C., & Shaw, T. H. (2021). Measurement of trust in automation: A narrative review and reference guide, Frontiers in Psychology 12.
Lee, J. D., & See, K. A. (2004). Trust in automation: Designing for appropriate reliance, Human Factors 46, 50–80.
Liu, Y., Zhang, X. F., Wegsman, D., Beauchamp, N., & Wang, L. (2022). Politics: Pretraining with same-story article comparison for ideology prediction and stance detection, arXiv preprint arXiv:2205.00619.
Lucas, G. M., Gratch, J., King, A., & Morency, L.-P. (2014). It’s only a computer: Virtual humans increase willingness to disclose, Computers in Human Behavior 37, 94–100.
Lyons, J. B., & Guznov, S. Y. (2019). Individual differences in human-machine trust: A multi-study look at the perfect automation schema, Theoretical Issues in Ergonomics Science 20, 440–458.
Malle, B. F., & Scheutz, M. (2017). Moral competence in social robots, in: Machine Ethics and Robot Ethics, Routledge.
Malle, B. F., Rosen, E., Chi, V. B., Berg, M., & Haas, P. (2020). A general methodology for teaching norms to social robots, in: 2020 29th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), IEEE, pp. 1395–1402.
Maninger, T., & Shank, D. B. (2022). Perceptions of violations by artificial and human actors across moral foundations, Computers in Human Behavior Reports 5, 100154.
Mayer, R. C., & Davis, J. H. (1999). The effect of the performance appraisal system on trust for management: A field quasi-experiment, Journal of Applied Psychology 84, 123–136.
Mayer, R. C., Davis, J. H., & Schoorman, F. D. (1995). An integrative model of organizational trust, Academy of Management Review 20, 709–734.
McVay, J., Visser, E. J. de, Pippin, B., Mani, A., Hyde, J. N., & Kman, N. (2025). Trust in aligned AI decision makers, in: 2025 IEEE Conference on Artificial Intelligence (CAI), IEEE, pp. 1–4.
Merritt, S. M., Unnerstall, J. L., Lee, D., & Huber, K. (2015). Measuring individual differences in the perfect automation schema, Human Factors 57, 740–753.
Momen, A., Visser, E. de, Wolsten, K., Cooley, K., Walliser, J., & Tossell, C. C. (2023). Trusting the moral judgments of a robot: Perceived moral competence and humanlikeness of a GPT-3 enabled AI, in: Proceedings of the 56th Hawaii International Conference on System Sciences.
Momen, A., Visser, E. J. de, Fraune, M. R., Madison, A., Rueben, M., Cooley, K., & Tossell, C. C. (2023). Group trust dynamics during a risky driving experience in a Tesla Model X, Frontiers in Psychology 14, 1129369.
Monfort, S. S., Graybeal, J. J., Harwood, A. E., McKnight, P. E., & Shaw, T. H. (2018). A single-item assessment for remaining mental resources: Development and validation of the Gas Tank Questionnaire (GTQ), Theoretical Issues in Ergonomics Science 19, 530–552.
Mori, M., MacDorman, K. F., & Kageki, N. (2012). The uncanny valley [from the field], IEEE Robotics & Automation Magazine 19, 98–100.
National Security Commission on Artificial Intelligence, Final report (2021). [URL].
O’Neill, T., McNeese, N., Barron, A., & Schelble, B. G. (2022). Human-autonomy teaming: A review and analysis of the empirical literature, Human Factors 64, 904–938.
Parasuraman, R., & Riley, V. (1997). Humans and automation: Use, misuse, disuse, abuse, Human Factors 39, 230–253.
Phillips, E., Zhao, X., Ullman, D., & Malle, B. F. (2018). What is human-like?: Decomposing robots’ human-like appearance using the anthropomorphic robot (ABOT) database, in: Proceedings of the 2018 ACM/IEEE International Conference on Human-Robot Interaction, pp. 105–113.
Raffard, S., Salesse, R. N., Marin, L., Del-Monte, J., Schmidt, R. C., Varlet, M., Bardy, B. G., Boulenger, J.-P., & Capdevielle, D. (2015). Social priming enhances interpersonal synchronization and feeling of connectedness towards schizophrenia patients, Scientific Reports 5, 1–10.
Roesler, E., Manzey, D., & Onnasch, L. (2021). A meta-analysis on the effectiveness of anthropomorphism in human-robot interaction, Science Robotics 6(58).
Schelble, B. G., Lopez, J., Textor, C., Zhang, R., McNeese, N. J., Pak, R., & Freeman, G. (2022). Towards ethical AI: Empirically investigating dimensions of AI ethics, trust repair, and performance in human-AI teaming, Human Factors.
Scheutz, M., Thielstrom, R., & Abrams, M. (2022). Transparency through explanations and justifications in human-robot task-based communications, International Journal of Human-Computer Interaction 38, 1739–1752.
Shariff, A. F., & Norenzayan, A. (2007). God is watching you: Priming God concepts increases prosocial behavior in an anonymous economic game, Psychological Science 18, 803–809.
Simmons, G. (2022). Moral mimicry: Large language models produce moral rationalizations tailored to political identity, arXiv preprint arXiv:2209.12106.
Sucholutsky, I., Muttenthaler, L., Weller, A., Peng, A., Bobu, A., Kim, B., Love, B. C., Grant, E., Groen, I., & Achterberg, J., et al. (2023). Getting aligned on representational alignment, arXiv preprint arXiv:2310.13018.
Tenhundfeld, N., Demir, M., & Visser, E. de (2022). Assessment of trust in automation in the “real world”: Requirements for new trust in automation measurement techniques for use by practitioners, Journal of Cognitive Engineering and Decision Making 16, 101–118.
Textor, C., Zhang, R., Lopez, J., Schelble, B. G., McNeese, N. J., Freeman, G., Pak, R., Tossell, C., & Visser, E. J. de (2022). Exploring the relationship between ethics and trust in human-artificial intelligence teaming: A mixed methods approach, Journal of Cognitive Engineering and Decision Making 15553434221113964.
Time, Why the WGA is striking for limits on the use of AI, Time.com, n.d.
Tossell, C. C., Tenhundfeld, N. L., Momen, A., Cooley, K., & Visser, E. J. De (2024). Student perceptions of ChatGPT use in a college essay assignment: Implications for learning, grading, and trust in artificial intelligence, IEEE Transactions on Learning Technologies 17, 1069–1081.
Ullman, D., & Malle, B. F. (2019). Measuring gains and losses in human-robot trust: Evidence for differentiable components of trust, in: 2019 14th ACM/IEEE International Conference on Human-Robot Interaction (HRI), pp. 618–619.
Ullman, D., Aladia, S., & Malle, B. F. (2021). Challenges and opportunities for replication science in HRI: A case study in human-robot trust, in: Proceedings of the 2021 ACM/IEEE International Conference on Human-Robot Interaction, pp. 110–118.
Varanasi, L. (2023). AI models like ChatGPT and GPT-4 are acing everything from the bar exam to AP Biology: Here’s a list of difficult exams both AI versions have passed. URL: [URL], accessed: 2025-08-04.
Visser, E. de, Cohen, M., Freedy, A., & Parasuraman, R. (2014). A design methodology for trust cue calibration in cognitive agents, in: Lecture Notes in Computer Science, volume 8525, pp. 262–271.
de Visser, E. J., Monfort, S., McKendrick, R., Smith, M., McKnight, P., Krueger, F., & Parasuraman, R. (2016). Almost human: Anthropomorphism increases trust resilience in cognitive agents, Journal of Experimental Psychology: Applied 22.
de Visser, E. J., Peeters, M. M. M., Jung, M., Kohn, S., Shaw, T., Pak, R., & Neerincx, M. (2020). Towards a theory of longitudinal trust calibration in human-robot teams, International Journal of Social Robotics 12.
Voiklis, J., Cusimano, C., & Malle, B. (2013). A social-conceptual map of moral criticism, in: Proceedings of the 7th International Conference on Social Robotics.
Wagner, A. R. (2020). Principles of evacuation robots, in: R. Pak, E. de Visser, E. Rovira (Eds.), Living with Robots, Academic Press, pp. 153–164.
Weisman, K., Dweck, C. S., & Markman, E. M. (2017). Rethinking people’s conceptions of mental life, Proceedings of the National Academy of Sciences 114, 11374–11379.
Williams, T., Ayers, D., Kaufman, C., Serrano, J., & Roy, S. (2021). Deconstructed trustee theory: Disentangling trust in body and identity in multi-robot distributed systems, in: Proceedings of the 2021 ACM/IEEE International Conference on Human-Robot Interaction, pp. 262–271.
Winkle, K., Jackson, R. B., Brščić, D., Leite, I., Melsion, G. I., & Williams, T. (2022). Norm-breaking responses to sexist abuse: A cross-cultural human-robot interaction study, International Journal of Social Robotics.
