Evaluation Of Medical Diagnosis Capabilities Of Three Artificial Intelligence Models – ChatGPT-3.5, Google Gemini, Microsoft Copilot

Yordanka Eneva; Bora Doğan

Authors

Yordanka Eneva, Medical University of Varna "Prof. Dr. Paraskev Stoyanov"
Bora Doğan, Medical University of Varna "Prof. Dr. Paraskev Stoyanov"

Keywords:

Artificial Intelligence, Medical Diagnosis, ChatGPT-3.5, Microsoft Copilot, Google Gemini, Diagnostic Capabilities, Comparative Analysis

Abstract

The widespread adoption of artificial intelligence (AI) in many domains, including medicine, has prompted extensive research into its diagnostic capabilities. This study compares three prominent AI models (ChatGPT-3.5, Microsoft Copilot, and Google Gemini) on their performance in medical diagnosis. Clinical vignettes from Texas Tech University Health Sciences Center were used to assess the accuracy and precision of the models in diagnosing internal medicine cases. ChatGPT-3.5 achieved the highest accuracy, correctly diagnosing 70.59% of cases and outperforming both Google Gemini and Microsoft Copilot. All models showed potential to assist in diagnosis, but they differed in approach and performance: ChatGPT-3.5 gave concise answers without explicitly stating its lack of medical expertise, whereas Google Gemini and Microsoft Copilot acknowledged their limitations but offered more detailed explanations and recommendations. A chi-square test for independence revealed significant differences in diagnostic capability among the models, which underlines the importance of careful model selection in clinical decision-making. The study offers insight into the application of AI in medical diagnosis and underscores the need for continued refinement of AI models to improve diagnostic accuracy and support healthcare professionals in delivering optimal patient care.
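As a rough illustration of the chi-square test for independence mentioned in the abstract, the sketch below computes the statistic for a 3 × 2 table of correct and incorrect diagnoses per model. The counts and the labels "Model A", "Model B", and "Model C" are hypothetical placeholders chosen for easy arithmetic; they are not the study's data, and the function name `chi_square_independence` is likewise an assumption made for this example.

```python
import math


def chi_square_independence(table):
    """Return (statistic, degrees_of_freedom) for a contingency table.

    `table` is a list of rows; each row holds the observed counts
    for one group (here: one model) across the outcome categories.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of model and outcome.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof


# HYPOTHETICAL counts for illustration only (NOT the study's data).
# Columns: [correct diagnoses, incorrect diagnoses]
observed = [
    [30, 10],  # "Model A" (placeholder label)
    [20, 20],  # "Model B" (placeholder label)
    [10, 30],  # "Model C" (placeholder label)
]

statistic, dof = chi_square_independence(observed)

# For exactly 2 degrees of freedom, the chi-square survival function
# has the closed form exp(-x / 2); this shortcut is not valid otherwise.
p_value = math.exp(-statistic / 2) if dof == 2 else None

print(f"chi2 = {statistic:.2f}, dof = {dof}, p = {p_value:.2e}")
```

With three models and two outcomes the test has (3 − 1)(2 − 1) = 2 degrees of freedom, which is why the closed-form p-value applies here. For tables of other shapes a statistics library would be the more practical choice.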

Author Biographies

Yordanka Eneva, Medical University of Varna "Prof. Dr. Paraskev Stoyanov"

Department of Physics and Biophysics, Faculty of Pharmacy, Bulgaria

Bora Doğan, Medical University of Varna "Prof. Dr. Paraskev Stoyanov"

Department of Physics and Biophysics, Faculty of Pharmacy, Bulgaria
