
Article In: Group Dynamics in Human–Robot Interaction
Edited by Alessandra Sciutti, Dario Pasquali, Giulia Belgiovine and Linda Lastrico
[Interaction Studies 26:3] 2025
pp. 422–476

References
Abdelrahman, A. A., Hempel, T., Khalifa, A., Al-Hamadi, A., & Dinges, L. (2023). L2CS-Net: Fine-grained gaze estimation in unconstrained environments. 2023 8th International Conference on Frontiers of Signal Processing (ICFSP), 98–102.
Addlesee, A. (2024). Grounding LLMs to in-prompt instructions: Reducing hallucinations caused by static pre-training knowledge. 3rd Workshop on Safety for Conversational AI, Safety4ConvAI 2024 at LREC-COLING 2024 — Workshop Proceedings, 1–7.
Addlesee, A., Sieińska, W., Gunson, N., Hernandez Garcia, D., Dondrup, C., & Lemon, O. (2023). Multi-party goal tracking with LLMs: Comparing pre-training, fine-tuning, and prompt engineering. In S. Stoyanchev, S. Joty, D. Schlangen, O. Dusek, C. Kennington, & M. Alikhani (Eds.), Proceedings of the 24th annual meeting of the special interest group on discourse and dialogue (pp. 229–241). Association for Computational Linguistics.
Agrawal, G., Kumarage, T., Alghamdi, Z., & Liu, H. (2023). Can knowledge graphs reduce hallucinations in LLMs? A survey. [URL]
Allgeuer, P., Ali, H., & Wermter, S. (2024). When robots get chatty: Grounding multimodal human-robot conversation and collaboration. Proceedings of the International Conference on Artificial Neural Networks.
Aylett, M. P., & Romeo, M. (2023). You don’t need to speak, you need to listen: Robot interaction and human-like turn-taking. Proceedings of the 5th International Conference on Conversational User Interfaces.
Bohus, D., & Horvitz, E. (2010). Facilitating multiparty dialog with gaze, gesture, and speech. International Conference on Multimodal Interfaces and the Workshop on Machine Learning for Multimodal Interaction.
Bohus, D., & Horvitz, E. (2011). Multiparty turn taking in situated dialog: Study, lessons, and directions. Proceedings of the SIGDIAL 2011 Conference, 98–109.
Bommasani, R., Klyman, K., Longpre, S., Kapoor, S., Maslej, N., Xiong, B., Zhang, D., & Liang, P. (2023). The foundation model transparency index. [URL]
Chiang, W.-L., Li, Z., Lin, Z., Sheng, Y., Wu, Z., Zhang, H., Zheng, L., Zhuang, S., Zhuang, Y., Gonzalez, J. E., Stoica, I., & Xing, E. P. (2023). Vicuna: An open-source chatbot impressing GPT-4 with 90%* ChatGPT quality. [URL]
Cooper, S., Fava, A. D., Vivas, C., Marchionni, L., & Ferro, F. (2020). ARI: The social assistive robot and companion. 2020 29th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), 745–751. [URL]
Deng, J., Guo, J., Zhou, Y., Yu, J., Kotsia, I., & Zafeiriou, S. (2019). RetinaFace: Single-stage dense face localisation in the wild. arXiv preprint.
Eisenberg, A., Gannot, S., & Chazan, S. E. (2023). A two-stage speaker extraction algorithm under adverse acoustic conditions using a single microphone. 2023 31st European Signal Processing Conference (EUSIPCO), 266–270.
Eshghi, A., & Healey, P. G. (2016). Collective contexts in conversation: Grounding by proxy. Cognitive Science, 40(2), 299–324.
Fu, J., Ng, S.-K., Jiang, Z., & Liu, P. (2023). GPTScore: Evaluate as you desire. arXiv preprint arXiv:2302.04166.
Ge, Z., Liu, S., Wang, F., Li, Z., & Sun, J. (2021). YOLOX: Exceeding YOLO series in 2021. arXiv preprint arXiv:2107.08430.
Gu, J.-C., Tan, C.-H., Tao, C., Ling, Z.-H., Hu, H., Geng, X., & Jiang, D. (2022). HeterMPC: A heterogeneous graph neural network for response generation in multi-party conversations. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 5086–5097.
Gu, J.-C., Tao, C., & Ling, Z.-H. (2022). WHO says WHAT to WHOM: A survey of multi-party conversations. Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence (IJCAI-22).
Gu, J.-C., Tao, C., Ling, Z., Xu, C., Geng, X., & Jiang, D. (2021). MPC-BERT: A pre-trained language model for multi-party conversation understanding. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics, 3682–3692.
Gunson, N., Addlesee, A., Hernandez Garcia, D., Romeo, M., Dondrup, C., & Lemon, O. (2024). A holistic evaluation methodology for multi-party spoken conversational agents. ACM International Conference on Intelligent Virtual Agents (IVA ’24).
Hu, W., Chan, Z., Liu, B., Zhao, D., Ma, J., & Yan, R. (2019). GSN: A graph-structured network for multi-party dialogues. Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19).
Incao, S., Mazzola, C., Belgiovine, G., & Sciutti, A. (2024). A roadmap for embodied and social grounding in LLMs. arXiv preprint arXiv:2409.16900.
Iniguez-Carrillo, A. L., Gaytan-Lugo, L. S., Garcia-Ruiz, M. A., & Maciel-Arellano, R. (2021). Usability questionnaires to evaluate voice user interfaces. IEEE Latin America Transactions, 19(9), 1468–1477.
Jayagopi, D. B., & Odobez, J.-M. (2013). Given that, should I respond? Contextual addressee estimation in multi-party human-robot interactions. Proceedings of the 8th ACM/IEEE International Conference on Human-Robot Interaction, 147–148.
Ji, Z., Yu, T., Xu, Y., Lee, N., Ishii, E., & Fung, P. (2023). Towards mitigating hallucination in large language models via self-reflection. EMNLP 2023, 1827–1843. [URL]
Jia, J., Komma, A., Leffel, T., Peng, X., Nagesh, A., Soliman, T., Galstyan, A., & Kumar, A. (2024). Leveraging LLMs for dialogue quality measurement. In Y. Yang, A. Davani, A. Sil, & A. Kumar (Eds.), Proceedings of the 2024 conference of the North American chapter of the Association for Computational Linguistics: Human language technologies (volume 6: industry track) (pp. 359–367). Association for Computational Linguistics.
Johansson, M., & Skantze, G. (2015). Opportunities and obligations to take turns in collaborative multi-party human-robot interaction. Proceedings of the 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue, 305–314.
Johansson, M., Skantze, G., & Gustafson, J. (2014). Comparison of human-human and human-robot turn-taking behaviour in multiparty situated interaction. Proceedings of the 2014 Workshop on Understanding and Modeling Multiparty, Multimodal Interactions, 21–26.
Kim, C. Y., Lee, C. P., & Mutlu, B. (2024). Understanding large-language model (LLM)-powered human-robot interaction. Proceedings of the 2024 ACM/IEEE International Conference on Human-Robot Interaction, 371–380.
Kojima, T., Gu, S. S., Reid, M., Matsuo, Y., & Iwasawa, Y. (2023). Large language models are zero-shot reasoners.
Lewis, J. R., & Hardzinski, M. L. (2015). Investigating the psychometric properties of the Speech User Interface Service Quality questionnaire. International Journal of Speech Technology, 18(3), 479–487.
Lin, T., Maire, M., Belongie, S. J., Hays, J., Perona, P., Ramanan, D., Dollár, P., & Zitnick, C. L. (2014). Microsoft COCO: Common objects in context. In D. J. Fleet, T. Pajdla, B. Schiele, & T. Tuytelaars (Eds.), Computer vision — ECCV 2014 — 13th European conference, Zurich, Switzerland, September 6–12, 2014, proceedings, part V (pp. 740–755). Springer.
Mahadevan, K., Chien, J., Brown, N., Xu, Z., Parada, C., Xia, F., Zeng, A., Takayama, L., & Sadigh, D. (2024). Generative expressive robot behaviors using large language models. Proceedings of the 2024 ACM/IEEE International Conference on Human-Robot Interaction, 482–491.
Mahajan, K., & Shaikh, S. (2021). On the need for thoughtful data collection for multi-party dialogue: A survey of available corpora and collection methods. In H. Li, G.-A. Levow, Z. Yu, C. Gupta, B. Sisman, S. Cai, D. Vandyke, N. Dethlefs, Y. Wu, & J. J. Li (Eds.), Proceedings of the 22nd annual meeting of the special interest group on discourse and dialogue (pp. 338–352). Association for Computational Linguistics.
Mazzola, C., Romeo, M., Rea, F., Sciutti, A., & Cangelosi, A. (2023). To whom are you talking? A deep learning model to endow social robots with addressee estimation skills. 2023 International Joint Conference on Neural Networks (IJCNN), 1–10.
Mishra, C., & Skantze, G. (2022). Knowing where to look: A planning-based architecture to automate the gaze behavior of social robots. 2022 31st IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), 1201–1208.
Mittelstädt, J. M., Maier, J., Goerke, P., Zinn, F., & Hermes, M. (2024). Large language models can outperform humans in social situational judgments. Scientific Reports, 14(1), 27449.
Mohamed, Y., & Lemaignan, S. (2021). ROS for human-robot interaction. 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 3020–3027.
Murali, P., Steenstra, I., Yun, H. S., Shamekhi, A., & Bickmore, T. (2023). Improving multiparty interactions with a robot using large language models. Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems.
Novikova, J., Lemon, O., & Rieser, V. (2016). Crowd-sourcing NLG data: Pictures elicit better data. CoRR, abs/1608.00339. [URL]
Parada, C. (2024). What do foundation models have to do with and for HRI? Proceedings of the 2024 ACM/IEEE International Conference on Human-Robot Interaction, 21.
Parreira, M. T., Gillet, S., Vazquez, M., & Leite, I. (2022). Design implications for effective robot gaze behaviors in multiparty interactions. Proceedings of the 2022 ACM/IEEE International Conference on Human-Robot Interaction, 976–980.
Rawte, V., Sheth, A., & Das, A. (2023). A survey of hallucination in large foundation models. [URL]
Schlangen, D., & Skantze, G. (2009). A general, abstract model of incremental dialogue processing. In A. Lascarides, C. Gardent, & J. Nivre (Eds.), Proceedings of the 12th conference of the European chapter of the ACL (EACL 2009) (pp. 710–718). Association for Computational Linguistics. [URL].
Shriberg, E., Stolcke, A., Hakkani-Tür, D., & Heck, L. P. (2012). Learning when to listen: Detecting system-addressed speech in human-human-computer dialog. Interspeech, 334–337.
Skantze, G. (2021). Turn-taking in conversational systems and human-robot interaction: A review. Computer Speech & Language, 67, 101178.
Skantze, G., Johansson, M., & Beskow, J. (2015). Exploring turn-taking cues in multi-party human-robot discussions about objects. Proceedings of the 2015 ACM on International Conference on Multimodal Interaction, 67–74.
Spatola, N., Kühnlenz, B., & Cheng, G. (2021). Perception and evaluation in human-robot interaction: The Human-Robot Interaction Evaluation Scale (HRIES) — A multicomponent approach of anthropomorphism. International Journal of Social Robotics, 13(7), 1517–1539.
Tan, C.-H., Gu, J.-C., & Ling, Z.-H. (2023). Is ChatGPT a good multi-party conversation solver? In H. Bouamor, J. Pino, & K. Bali (Eds.), Findings of the Association for Computational Linguistics: EMNLP 2023 (pp. 4905–4915). Association for Computational Linguistics.
Traum, D. (2004). Issues in multiparty dialogues. Advances in Agent Communication: International Workshop on Agent Communication Languages, ACL 2003, Melbourne, Australia, July 14, 2003. Revised and Invited Papers, 201–211.
Tzinis, E., Wang, Z., Jiang, X., & Smaragdis, P. (2022). Compute and memory efficient universal sound source separation. Journal of Signal Processing Systems, 94(2), 245–259.
Wachowiak, L., Coles, A., Celiktutan, O., & Canal, G. (2024). Are large language models aligned with people’s social intuitions for human-robot interactions? 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2520–2527.
Wagner, D., Churchill, A., Sigtia, S., Georgiou, P., Mirsamadi, M., Mishra, A., & Marchi, E. (2024). A multimodal approach to device-directed speech detection with large language models. ICASSP 2024 — 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 10451–10455.
Wang, B., Chen, W., Pei, H., Xie, C., Kang, M., Zhang, C., Xu, C., Xiong, Z., Dutta, R., Schaeffer, R., Truong, S. T., Arora, S., Mazeika, M., Hendrycks, D., Lin, Z., Cheng, Y., Koyejo, S., Song, D., & Li, B. (2023). DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models. (NeurIPS 2023). [URL]
Wang, C., Hasler, S., Tanneberg, D., Ocker, F., Joublin, F., Ceravola, A., Deigmoeller, J., & Gienger, M. (2024). LaMI: Large language models for multi-modal human-robot interaction. Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, 1–10.
Yang, C., Wang, X., Lu, Y., Liu, H., Le, Q. V., Zhou, D., & Chen, X. (2023). Large language models as optimizers.
Zhang, C., Chen, J., Li, J., Peng, Y., & Mao, Z. (2023). Large language models for human-robot interaction: A review. Biomimetic Intelligence and Robotics, 3(4), 100131.
Zhang, S., Dong, L., Li, X., Zhang, S., Sun, X., Wang, S., Li, J., Hu, R., Zhang, T., Wu, F., & Wang, G. (2023). Instruction tuning for large language models: A survey.
Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., & Wang, X. (2022). ByteTrack: Multi-object tracking by associating every detection box. Proceedings of the European Conference on Computer Vision (ECCV).
Zheng, L., Chiang, W.-L., Sheng, Y., Zhuang, S., Wu, Z., Zhuang, Y., Lin, Z., Li, Z., Li, D., Xing, E. P., Zhang, H., Gonzalez, J. E., & Stoica, I. (2024). Judging LLM-as-a-judge with MT-Bench and Chatbot Arena. Proceedings of the 37th International Conference on Neural Information Processing Systems.
Zhong, M., Liu, Y., Xu, Y., Zhu, C., & Zeng, M. (2022). DialogLM: Pre-trained model for long dialogue understanding and summarization. Proceedings of the AAAI Conference on Artificial Intelligence, 36, 11765–11773.
Zhong, Y., Xie, J., Wang, J., Fan, B., Fang, Z., & Peng, B. (2024). Improving large language models in multi-party conversations through role-playing. In D.-S. Huang, X. Zhang, & Q. Zhang (Eds.), Advanced intelligent computing technology and applications (pp. 209–220). Springer Nature Singapore.