Article published in: Future Perspectives in Medical Argumentation
Edited by Sarah Bigi and Maria Grazia Rossi
[Journal of Argumentation in Context 14:3] 2025
pp. 321–365
Large Language Models, argumentation, and healthcare
A socio-cognitive perspective
Published online: 4 December 2025
https://doi.org/10.1075/jaic.25024.pag
Abstract
There are various ways in which Large Language Models (LLMs), the latest breakthrough in Artificial Intelligence, are relevant for medicine: this paper focuses on their potential for supporting and improving argumentation in healthcare, both for patients and for practitioners. The message is mostly positive, suggesting the adoption of such systems, but with specific cautions: most notably, the need to leverage them to enhance human communicative and epistemic performance rather than replace it, and the importance of training users on a few key principles to guide their deployment of LLMs in healthcare. The paper is accompanied by four concrete use cases, included in the supplementary materials, that constitute an integral and crucial part of this contribution.
Article outline
- 1. Introduction
- 2. Misplaced fears of AI for argumentation in healthcare
- 3. How to improve patients’ information search and argumentative experience with LLMs
- 4. How to hone argumentative skills in healthcare professionals using LLMs
- 5. Conclusions
- Notes
- Supplementary materials
- Case study 1. Patient’s use of LLMs to collect healthcare information on a mundane problem
- Case study 2. Patient’s use of LLMs to collect healthcare information on a potentially serious problem
- Case study 3. Healthcare worker’s use of LLMs to review and improve their argumentative strategies with patients
- Case study 4. Healthcare worker’s use of LLMs to roleplay an argumentative interaction with a problematic patient and receive feedback on performance
